The connection between the statistical physics of disordered systems and optimization problems in computer science
dates back at least twenty years [1]. In combinatorial optimization one is given a cost function (the length
of a tour in the traveling salesman problem (TSP), the number of violated constraints in constraint satisfaction
problems, ...) over a set of variables and looks for the minimal cost over an allowed range for those variables.
Finding the true minimum may be complicated, and requires greater and greater computational effort as the number
of variables to be minimized over increases [2]. Statistical physics is at first sight very different. Its aim is to
deduce the macroscopic, that is, global properties of a physical system, for instance a gas, a liquid or a solid, from the
knowledge of the energetic interactions of its elementary components (molecules, atoms or ions). However, at very
low temperature, these elementary components are essentially forced to occupy the spatial conformation minimizing
the global energy of the system. Hence low temperature statistical physics can be seen as the search for the minimum of
a cost function whose expression reflects the laws of Nature or, more humbly, the degree of accuracy retained in
its description. This problem is generally not difficult to solve for non disordered systems, where the lowest energy
conformations are crystals in which components are regularly spaced from each other. Yet the presence of disorder, e.g.
impurities, makes the problem very difficult, and finding the conformation with minimal energy is a true optimization
problem.

At the beginning of the eighties, following the works of G. Parisi and others on systems called spin glasses [1],
important progress was made in the statistical physics of disordered systems. This progress made possible
the quantitative study of the properties of systems given some distribution of the disorder (for instance the location
of impurities) such as the average minimal energy and its fluctuations. The application to optimization problems
was natural and led to beautiful studies on (among others) the average properties of the minimal tour length in the 
TSP, the minimal cost in Bipartite Matching, for some specific instance distributions [1]. Unfortunately statistical 
physicists and computer scientists did not establish close ties on a large scale at that time. The reason could have 
been of methodological nature [3]. While physicists were making statistical statements, true for a given distribution 
of inputs, computer scientists were rather interested in solving one (or several) particular instances of a problem. The 
focus was thus on efficient ways to do so, that is, requiring a computational effort growing not too quickly with the 
number of data defining the instance. Knowing precisely the typical properties for a given, academic distribution of 
instances did not help much to solve practical cases. 

At the beginning of the nineties practitioners in artificial intelligence realized that classes of random constraint
satisfaction problems used as artificial benchmarks for search algorithms exhibited abrupt changes of behaviour when
some control parameter was finely tuned [4]. The most celebrated example was random $k$-Satisfiability, where one
looks for a solution to a set of random logical constraints over a set of Boolean variables. It appeared that, for large
sets of variables, there was a critical value of the number of constraints per variable below which there almost surely
existed solutions, and above which solutions were absent. An important feature was that the performance of known
search algorithms drastically worsened in the vicinity of this critical ratio. In addition to its intrinsic mathematical
interest the random $k$-SAT problem was therefore worth studying for 'practical' reasons.

This critical phenomenon, strongly reminiscent of phase transitions in condensed matter physics, led to a revival of 
the research at the interface between statistical physics and computer science, which is still very active. The purpose 
of the present review is to introduce the non physicist reader to some concepts required to understand the literature 
in the field and to present some major results. We shall in particular discuss the refined picture of the satisfiable 
phase put forward in statistical mechanics studies and the algorithmic approach (Survey Propagation, an extension 
of Belief Propagation used in communication theory and statistical inference) this picture suggested. 

While the presentation will mostly focus on the $k$-Satisfiability problem (with random constraints) we will
occasionally discuss another computational problem, namely, linear systems of Boolean equations. A good reason to do
so is that this problem exhibits some essential features encountered in random $k$-Satisfiability, while being technically
simpler to study. In addition it is closely related to error-correcting codes in communication theory.

The chapter is divided into four main parts. In Section II we present the basic statistical physics concepts necessary
to understand the onset of phase transitions, and to characterize the nature of the phases. Those are illustrated on
a simple example of decision problem, the so-called perceptron problem. In Section III we review the scenario of
the various phase transitions taking place in random $k$-SAT. Sections IV and V present the techniques used to study
various types of algorithms in optimization (local search, backtracking procedures, message passing algorithms). We
end with some concluding remarks in Sec. VI.



II. PHASE TRANSITIONS: BASIC CONCEPTS AND ILLUSTRATION 
A. A simple decision problem with a phase transition: the continuous perceptron 

For pedagogical reasons we first discuss a simple example exhibiting several important features we shall define more
formally in the next subsection. Consider $M$ points $\underline{T}^1, \ldots, \underline{T}^M$ of the $N$-dimensional space $\mathbb{R}^N$, their coordinates
being denoted $\underline{T}^a = (T^a_1, \ldots, T^a_N)$. The continuous perceptron problem consists in deciding the existence of a vector
$\underline{\sigma} \in \mathbb{R}^N$ which has a positive scalar product with all vectors linking the origin of $\mathbb{R}^N$ to the $\underline{T}$'s,

$$ \underline{\sigma} \cdot \underline{T}^a = \sum_{i=1}^{N} \sigma_i T^a_i > 0 , \qquad \forall\, a = 1, \ldots, M , \tag{1} $$

or in other words determining whether the $M$ points belong to the same half-space. The term continuous in the name
of the problem emphasizes the domain $\mathbb{R}^N$ of the variable $\underline{\sigma}$. This makes the problem polynomial from the worst-case
complexity point of view [5].

Suppose now that the points are chosen independently, identically, uniformly on the unit hypersphere, and call

$P(N, M)$ = Probability that a set of $M$ randomly chosen points
belong to the same half-space.

This quantity can be computed exactly [6] (see also Chapter 5.7 of [5]) and is plotted in Fig. 1 as a function of the
ratio $\alpha = M/N$ for increasing sizes $N = 5, 20, 100$. Obviously $P$ is a decreasing function of the number $M$ of points
for a given size $N$: increasing the number of constraints can only make the simultaneous satisfaction of
all of them more difficult. More surprisingly, the figure suggests that, in the large size limit $N \to \infty$, the probability $P$ reaches a
limiting value 0 or 1 depending on whether the ratio $\alpha$ lies, respectively, above or below some 'critical' value $\alpha_s = 2$.
This is confirmed by the analytical expression of $P$ obtained in [6],

$$ P(N, M) = \frac{1}{2^{M-1}} \sum_{i=0}^{\min(N-1,\,M-1)} \binom{M-1}{i} , \tag{2} $$

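from which one can easily show that, indeed,

$$ \lim_{N \to \infty} P(N, M = N\alpha) = \begin{cases} 1 & \text{if } \alpha < \alpha_s \\ 0 & \text{if } \alpha > \alpha_s \end{cases} \qquad \text{with } \alpha_s = 2 . \tag{3} $$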

Actually the analytical expression of $P$ allows one to describe more accurately the drop in the probability as $\alpha$ increases.
To this aim we zoom in on the transition region $M \approx N\alpha_s$ and find from (2) that

$$ \lim_{N \to \infty} P(N, M = N\alpha_s(1 + \lambda N^{-1/2})) = \int_{\lambda\sqrt{2}}^{\infty} \frac{dx}{\sqrt{2\pi}} \, e^{-x^2/2} . \tag{4} $$

As it should, the limits $\lambda \to \pm\infty$ give back the coarse description of Eq. (3).

FIG. 1: Probability $P(N, M)$ that $M$ random points on the $N$-dimensional unit hypersphere are located in the same half-space.
Symbols correspond to Cover's exact result [6], see Eq. (2); lines serve as guides to the eye.
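
The exact expression (2) and the scaling form (4) are easy to evaluate numerically. The following is a minimal sketch rather than code from the references; the function names `cover_probability` and `fss_limit` are illustrative choices. The Gaussian integral of Eq. (4) is written with the complementary error function, using the identity $\int_{\lambda\sqrt{2}}^{\infty} dx\, e^{-x^2/2}/\sqrt{2\pi} = \frac{1}{2}\,\mathrm{erfc}(\lambda)$.

```python
from math import comb, erfc, sqrt


def cover_probability(n: int, m: int) -> float:
    """Eq. (2): probability that m random points in dimension n share a half-space."""
    upper = min(n - 1, m - 1)
    partial_sum = sum(comb(m - 1, i) for i in range(upper + 1))
    return partial_sum / 2 ** (m - 1)


def fss_limit(lam: float) -> float:
    """Right-hand side of Eq. (4): Gaussian weight of the interval [lam*sqrt(2), +inf)."""
    return 0.5 * erfc(lam)


if __name__ == "__main__":
    # Values of P along the ratios alpha = 1, 1.5, 2, 2.5, 3 for the sizes of Fig. 1.
    for n in (5, 20, 100):
        row = [round(cover_probability(n, int(alpha * n)), 4)
               for alpha in (1.0, 1.5, 2.0, 2.5, 3.0)]
        print(n, row)
    # Compare a finite-size value with the scaling function of Eq. (4).
    n, lam = 400, 0.5
    m = round(2 * n * (1 + lam / sqrt(n)))
    print(cover_probability(n, m), fss_limit(lam))
```

At $M = 2N$ the sum in (2) covers exactly half of the binomial coefficients, so $P(N, 2N) = 1/2$ for every $N$; this is the fixed point around which the curves of Fig. 1 sharpen.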

B. Generic definitions 

We now put this simple example in a broader perspective and introduce some generic concepts that it illustrates, 
along with the definitions of the problems studied in the following. 

• Constraint Satisfaction Problem (CSP)

A CSP is a decision problem where an assignment (or configuration) of $N$ variables $\underline{\sigma} = (\sigma_1, \ldots, \sigma_N) \in \mathcal{X}^N$ is
required to simultaneously satisfy $M$ constraints. In the continuous perceptron the domain of $\underline{\sigma}$ is $\mathbb{R}^N$ and the
constraints impose the positivity of the scalar products (1). The instance of the CSP, also called formula in
the following, is said to be satisfiable if there exists a solution (an assignment of $\underline{\sigma}$ fulfilling all the constraints). The
$k$-SAT problem is a boolean CSP ($\mathcal{X}$ = {True, False}) where each constraint (clause) is the disjunction (logical
OR) of $k$ literals (a variable or its negation). Similarly in $k$-XORSAT the literals are combined by an eXclusive
OR operation, or equivalently an addition modulo 2 of 0/1 boolean variables is required to take a given value.
The worst-case complexities of these two problems are very different ($k$-XORSAT is in the P complexity class for
any $k$ while $k$-SAT is NP-complete for any $k \geq 3$), yet for the issues of this review we shall see that they present
a lot of similarities. In the following we use the statistical mechanics convention and represent boolean variables
by Ising spins, $\mathcal{X} = \{-1, +1\}$. A $k$-SAT clause will be defined by $k$ indices $i_1, \ldots, i_k \in [1, N]$ and $k$ values
$J_{i_1}, \ldots, J_{i_k} = \pm 1$, such that the clause is unsatisfied by the assignment $\underline{\sigma}$ if and only if $\sigma_{i_j} = J_{i_j}$ $\forall j \in [1, k]$. A
$k$-XORSAT clause is satisfied if the product of the spins is equal to a fixed value, $\sigma_{i_1} \cdots \sigma_{i_k} = J$.

• random Constraint Satisfaction Problem (rCSP)

The set of instances of most CSPs can be turned into a probabilistic space by defining a distribution over its
constraints, as was done in the perceptron case by drawing the vertices $\underline{T}^a$ uniformly on the hypersphere. The
random $k$-SAT formulas considered in the following are obtained by choosing for each clause $a$ independently a
$k$-uplet of distinct indices $i_1^a, \ldots, i_k^a$ uniformly over the $\binom{N}{k}$ possible ones, and negating or not the corresponding
literals ($J_i^a = \pm 1$) with equal probability one-half. The indices of random XORSAT formulas are chosen
similarly, with the constant $J^a = \pm 1$ uniformly.

• thermodynamic limit and phase transitions

These two terms are the physics jargon for, respectively, the large size limit ($N \to \infty$) and for threshold
phenomena as stated for instance in (3). In the thermodynamic limit the typical behavior of physical systems
is controlled by a small number of parameters, for instance the temperature and pressure of a gas. At a phase
transition these systems are drastically altered by a tiny change of a control parameter; think for instance of
what happens to water when its temperature crosses 100 °C. This critical value of the temperature separates
two qualitatively distinct phases, liquid and gaseous. For random CSPs the role of control parameter is usually
played by the ratio of constraints per variable, $\alpha = M/N$, kept constant in the thermodynamic limit. Eq. (3)
describes a satisfiability transition for the continuous perceptron, the critical value $\alpha_s = 2$ separating a satisfiable
phase at low $\alpha$ where instances typically have solutions from a phase where they typically do not. Typically is used
here as a synonym for with high probability, i.e. with a probability which goes to one in the thermodynamic
limit.

• Finite Size Scaling (FSS)

The refined description of the neighborhood of the critical value of $\alpha$ provided by (4) is known as a finite size
scaling relation. More generally the finite size scaling hypothesis for a threshold phenomenon takes the form

$$ \lim_{N \to \infty} P(N, M = N\alpha_s(1 + \lambda N^{-1/\nu})) = \mathcal{F}(\lambda) , \tag{5} $$

where $\nu$ is called the FSS exponent (2 for the continuous perceptron) and the scaling function $\mathcal{F}(\lambda)$ has limits
1 and 0 at respectively $-\infty$ and $+\infty$. This means that, for a large but finite size $N$, the transition window for
the values of $M/N$ where the probability drops from $1 - \epsilon$ down to $\epsilon$ is, for arbitrarily small $\epsilon$, of width $N^{-1/\nu}$.
Results of this flavour are familiar in the study of random graphs [7]; for instance the appearance of a giant
component containing a finite fraction of the vertices of an Erdős-Rényi random graph happens on a window
of width $N^{-1/3}$ on the average connectivity. FSS relations are important, not only from the theoretical point
of view, but also for practical applications. Indeed numerical experiments are always performed on finite-size
instances while theoretical predictions on phase transitions are usually true in the $N \to \infty$ limit. Finite-size
scaling relations help to bridge the gap between the two. We shall review some FSS results in Sec. III E.
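
The random $k$-SAT ensemble and the spin convention defined in the list above translate directly into code. The sketch below is illustrative only: the representation of a clause as a list of `(index, J)` pairs and the names `random_ksat` and `energy` are assumptions made here, and indices run from 0 rather than 1.

```python
import random


def random_ksat(n, m, k, rng):
    """Draw m clauses; each clause is a list of k (index, J) pairs on distinct indices."""
    formula = []
    for _ in range(m):
        indices = rng.sample(range(n), k)  # k distinct variables, uniformly chosen
        formula.append([(i, rng.choice((-1, 1))) for i in indices])
    return formula


def energy(formula, sigma):
    """Number of violated clauses; a clause is violated iff sigma_i == J_i on all its variables."""
    return sum(all(sigma[i] == j for i, j in clause) for clause in formula)


if __name__ == "__main__":
    rng = random.Random(0)
    n, alpha, k = 20, 4.0, 3
    formula = random_ksat(n, int(alpha * n), k, rng)
    sigma = [rng.choice((-1, 1)) for _ in range(n)]
    print(energy(formula, sigma))
```

A formula is satisfiable exactly when some assignment has `energy` equal to zero; random XORSAT instances can be drawn in the same way, keeping a single constant $J^a$ per equation.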

Let us emphasize that random $k$-SAT, and other random CSPs, are expected to share some features of the continuous
perceptron model, for instance the existence of a satisfiability threshold, but of course not its extreme analytical
simplicity. In fact, despite intensive research activity, the mere existence of a satisfiability threshold for random SAT
formulas remains a (widely accepted) conjecture. A significant achievement towards the resolution of the conjecture
was the proof by Friedgut of the existence of a non-uniform sharp threshold [8]. There also exist upper [9] and
lower [10] bounds on the possible location of this putative threshold, which become almost tight for large values of
$k$ [11]. We refer the reader to the chapter [12] of this volume for more details on these issues. This difficulty in
obtaining tight results with the currently available rigorous techniques is a motivation for the use of heuristic statistical
mechanics methods, which provide intuitions on why the standard mathematical ones run into trouble and how to
amend them. In recent years important results first conjectured by physicists were indeed rigorously proven.
Before describing in some generality the statistical mechanics approach, it is instructive to study a simple variation
of the perceptron model for which the basic probabilistic techniques become inefficient.



C. The perceptron problem continued: binary variables 

The binary perceptron problem consists in looking for solutions of (1) on the hypercube, i.e. the domain of the
variable $\underline{\sigma}$ is $\mathcal{X}^N = \{-1, +1\}^N$ instead of $\mathbb{R}^N$. This decision problem is NP-complete. Unfortunately Cover's calculation
[6] cannot be extended to this case, though it is natural to expect a similar satisfiability threshold phenomenon
at an a priori distinct value $\alpha_s$. Let us first try to study this point with basic probabilistic tools, namely the first and
second moment methods [13]. The former is an application of the Markov inequality,

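$$ \mathrm{Prob}[Z > 0] \leq \mathbb{E}[Z] , \tag{6} $$

valid for non-negative integer valued random variables $Z$. We shall use it taking for $Z$ the number of solutions of (1),

$$ Z = \sum_{\underline{\sigma} \in \mathcal{X}^N} \prod_{a=1}^{M} \theta(\underline{\sigma} \cdot \underline{T}^a) , \tag{7} $$

where $\theta(x) = 1$ if $x > 0$, $0$ if $x < 0$. The expectation value of the number of solutions is easily computed,

$$ \mathbb{E}[Z] = 2^N \times 2^{-M} = e^{N G_1} \quad \text{with} \quad G_1 = (1 - \alpha) \ln 2 , \tag{8} $$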

and vanishes when $N \to \infty$ if $\alpha > 1$. Hence, from Markov's inequality (6), with high probability constraints (1) have
no solution on the hypercube when the ratio $\alpha$ exceeds unity: if the threshold $\alpha_s$ exists, it must satisfy the bound
$\alpha_s \leq 1$. One can look for a lower bound to $\alpha_s$ using the second moment method, relying on the inequality [13]

$$ \frac{\mathbb{E}[Z]^2}{\mathbb{E}[Z^2]} \leq \mathrm{Prob}[Z > 0] . \tag{9} $$
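
The first moment (8) and the Markov bound (6) can be checked by brute force on very small instances. This is a sketch under stated assumptions, not a method used in the references: the helper names `count_solutions` and `sample_z` are hypothetical, the points are drawn as Gaussian vectors (whose directions are uniform on the sphere), and the sizes are kept tiny because the enumeration costs $2^N$ per instance.

```python
import itertools
import random


def count_solutions(points, n):
    """Z of Eq. (7): number of hypercube vectors with positive overlap with every point."""
    count = 0
    for sigma in itertools.product((-1, 1), repeat=n):
        if all(sum(s * t for s, t in zip(sigma, point)) > 0 for point in points):
            count += 1
    return count


def sample_z(n, m, samples, rng):
    """Draw `samples` independent instances and return the list of their Z values."""
    values = []
    for _ in range(samples):
        # Gaussian vectors have a uniformly distributed direction; their length
        # does not change the sign of the scalar products, so it is not normalized.
        points = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
        values.append(count_solutions(points, n))
    return values


rng = random.Random(0)
n, m, samples = 6, 8, 2000
z_values = sample_z(n, m, samples, rng)
mean_z = sum(z_values) / samples                      # estimate of E[Z]
prob_positive = sum(z > 0 for z in z_values) / samples  # estimate of Prob[Z > 0]
print(mean_z, 2.0 ** (n - m), prob_positive)
```

The empirical mean should fluctuate around $2^{N-M}$, and the empirical frequency of satisfiable instances can never exceed the empirical mean of $Z$, since the indicator of $Z > 0$ is bounded by $Z$ instance by instance.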

The expectation value of the squared number of solutions reads 

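$$ \mathbb{E}[Z^2] = \sum_{\underline{\sigma}, \underline{\sigma}'} \left( \mathbb{E}[\theta(\underline{\sigma} \cdot \underline{T}) \, \theta(\underline{\sigma}' \cdot \underline{T})] \right)^M \tag{10} $$

since the vertices $\underline{T}^a$ are chosen independently of each other. The expectation value on the right hand side of the
above expression is simply the probability that the vector pointing to a randomly chosen vertex, $\underline{T}$, has positive scalar
product with both vectors $\underline{\sigma}, \underline{\sigma}'$. Elementary geometrical considerations reveal that

$$ \mathbb{E}[\theta(\underline{\sigma} \cdot \underline{T}) \, \theta(\underline{\sigma}' \cdot \underline{T})] = \frac{1}{2\pi} \left( \pi - \varphi(\underline{\sigma}, \underline{\sigma}') \right) \tag{11} $$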

where $\varphi$ is the relative angle between the two vectors. This angle can be alternatively parametrized by the overlap
between $\underline{\sigma}$ and $\underline{\sigma}'$, i.e. the normalized scalar product,

$$ q = \frac{1}{N} \sum_{i=1}^{N} \sigma_i \sigma'_i = 1 - \frac{2}{N} \sum_{i=1}^{N} \mathbb{I}(\sigma_i \neq \sigma'_i) . \tag{12} $$

The last expression, in which $\mathbb{I}(E)$ denotes the indicator function of the event $E$, reveals the correspondence between the
concept of overlap and the more traditional Hamming distance. The sum over vectors in (10) can then be replaced
by a sum over overlap values with appropriate combinatorial coefficients counting the number of pairs of vectors at a
given overlap. The outcome is

$$ \mathbb{E}[Z^2] = 2^N \sum_{q = -1,\, -1 + \frac{2}{N},\, \ldots,\, 1} \binom{N}{N \frac{1+q}{2}} \left( \frac{1}{2} - \frac{1}{2\pi} \mathrm{Arccos}\, q \right)^M . \tag{13} $$

In the large $N$ limit we can estimate this sum with the Laplace method,

$$ \lim_{N \to \infty} \frac{1}{N} \ln \mathbb{E}[Z^2] = \max_{-1 < q < 1} G_2(q) , \tag{14} $$



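where

$$ G_2(q) = \ln 2 - \left( \frac{1+q}{2} \right) \ln \left( \frac{1+q}{2} \right) - \left( \frac{1-q}{2} \right) \ln \left( \frac{1-q}{2} \right) + \alpha \ln \left( \frac{1}{2} - \frac{1}{2\pi} \mathrm{Arccos}\, q \right) . \tag{15} $$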
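
The maximization in (14) can be explored numerically. The sketch below evaluates $G_2$ of Eq. (15) on a grid and compares its maximum with $2 G_1$ from Eq. (8); the function names and the grid resolution are illustrative assumptions. At $q = 0$ one has $G_2(0) = 2 G_1$ exactly, and the derivative of the last term of (15) at $q = 0$ equals $2\alpha/\pi > 0$, so the maximum lies at some $q > 0$ as soon as $\alpha > 0$.

```python
from math import acos, log, pi


def g1(alpha):
    """Exponential growth rate of E[Z], Eq. (8)."""
    return (1.0 - alpha) * log(2.0)


def g2(q, alpha):
    """Exponent of Eq. (15) at overlap q, for -1 < q < 1."""
    up, down = (1.0 + q) / 2.0, (1.0 - q) / 2.0
    entropy = -up * log(up) - down * log(down)
    return log(2.0) + entropy + alpha * log(0.5 - acos(q) / (2.0 * pi))


def maximize_g2(alpha, grid=2000):
    """Grid search for the maximum of G_2 over the open interval (-1, 1).

    Returns the pair (maximal value, maximizing q)."""
    candidates = (-1.0 + 2.0 * j / grid for j in range(1, grid))
    return max((g2(q, alpha), q) for q in candidates)


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.8):
        best, q_star = maximize_g2(alpha)
        # The gap best - 2*g1(alpha) is the exponential rate at which
        # E[Z^2] / E[Z]^2 grows with N.
        print(alpha, q_star, best - 2.0 * g1(alpha))
```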



Two conclusions can be drawn from the above calculation: 



• no useful lower bound to $\alpha_s$ can be obtained from such a direct application of the second moment method.
Indeed, maximization of $G_2$ (15) over $q$ shows that $\mathbb{E}[Z^2] \gg (\mathbb{E}[Z])^2$ when $N$ diverges, whenever $\alpha > 0$, and in
consequence the left hand side of (9) vanishes. A possible scenario which explains this absence of concentration
of the number of solutions is the following. As shown by the moment calculation the natural scaling of $Z$ is
exponentially large in $N$ (as is the total configuration space $\mathcal{X}^N$). We shall thus denote $s = (\ln Z)/N$ the
random variable of order one counting the log degeneracy of the solutions. Suppose $s$ follows a large deviation
principle [14] that we state in a very rough way as $\mathrm{Prob}[s] \approx \exp[N L(s)]$, with $L(s)$ a negative rate function,
assumed for simplicity to be concave. Then the moments of $Z$ are given, at the leading exponential order, by

$$ \lim_{N \to \infty} \frac{1}{N} \ln \mathbb{E}[Z^n] = \max_{s} [L(s) + n s] , \tag{16} $$

and are controlled by the values of $s$ such that $L'(s) = -n$. The moments of larger and larger order $n$ are thus
dominated by the contribution of rarer and rarer instances with larger and larger numbers of solutions. On
the contrary the typical value of the number of solutions is given by the maximum of $L$, reached in a value we
denote $s_g(\alpha)$: with high probability when $N \to \infty$, $Z$ lies between $e^{N(s_g(\alpha) - \epsilon)}$ and $e^{N(s_g(\alpha) + \epsilon)}$ for any
$\epsilon > 0$. From this reasoning it appears that the relevant quantity to be computed is

$$ s_g(\alpha) = \lim_{N \to \infty} \frac{1}{N} \mathbb{E}[\ln Z] = \lim_{N \to \infty} \lim_{n \to 0} \frac{1}{nN} \ln \mathbb{E}[Z^n] . \tag{17} $$

This idea of computing moments of vanishing order is known in statistical mechanics as the replica¹ method [1].
Its non-rigorous implementation consists in determining the moments of integer order $n$, which are then continued
towards $n = 0$. The outcome of such a computation for the binary perceptron problem reads [15]

$$ s_g(\alpha) = \max_{q, \hat{q}} \left\{ -\frac{1}{2} \hat{q} (1 - q) + \int_{-\infty}^{\infty} Dz \, \ln(2 \cosh(z \sqrt{\hat{q}})) + \alpha \int_{-\infty}^{\infty} Dz \, \ln \left[ \int_{z \sqrt{q/(1-q)}}^{\infty} Dy \right] \right\} , \tag{18} $$

where $Dz \equiv dz \, e^{-z^2/2} / \sqrt{2\pi}$. The entropy $s_g(\alpha)$ is a decreasing function of $\alpha$, which vanishes at $\alpha_s \simeq 0.833$.
Numerical experiments support this value for the critical ratio of the satisfiable/unsatisfiable phase transition.

¹ The term replicas comes from the presence of $n$ copies of the vector $\underline{\sigma}$ in the calculation of $Z^n$ (see the $n = 2$ case in formula (10)).

• the calculation of the second moment is naturally related to the determination of the value of the overlap $q$
between pairs of solutions (or equivalently their Hamming distance, recall Eq. (12)). This conclusion extends
to the calculation of the $n$-th moment for any integer $n$, and to the $n \to 0$ limit. The value of $q$ maximizing the
r.h.s. of (18), $q^*(\alpha)$, represents the average overlap between two solutions of the same set of constraints (1).
Actually the distribution of overlaps is highly concentrated in the large $N$ limit around $q^*(\alpha)$, in other words
the (reduced) Hamming distance between two solutions is, with high probability, equal to $d^*(\alpha) = (1 - q^*(\alpha))/2$.
This distance $d^*(\alpha)$ ranges from $\frac{1}{2}$ for $\alpha = 0$ to $\simeq \frac{1}{4}$ at $\alpha = \alpha_s$. Slightly below the critical ratio solutions are
still far away from each other on the hypercube².

Note that the perceptron problem is not as far as it could seem from the main subject of this review. There
exists indeed a natural mapping between the binary perceptron problem and $k$-SAT. Assume the vertices $\underline{T}$ of
the perceptron problem, instead of being drawn on the hypersphere, have coordinates that can take three values:
$T_i = -1, 0, 1$. Consider now a $k$-SAT formula $F$. To each clause $a$ of $F$ we associate the vertex $\underline{T}^a$ with coordinates
$T_i^a = -J_i^a$ if variable $i$ appears in clause $a$, 0 otherwise. Of course $|\underline{T}^a|^2 = k$: exactly $k$ coordinates have non zero
values for each vertex. Then replace condition (1) with

$$ \sum_{i=1}^{N} \sigma_i T_i^a > -(k - 1) , \qquad \forall\, a = 1, \ldots, M . \tag{19} $$

The scalar product is not required to be positive any longer, but to be larger than $-(k - 1)$. It is an easy check
that the perceptron problem admits a solution on the hypercube ($\sigma_i = \pm 1$) if and only if $F$ is satisfiable. While in
the binary perceptron model all coordinates are non-vanishing, only a finite number of them take non zero values in
$k$-SAT. For this reason $k$-SAT is called a diluted model in statistical physics.
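
The "easy check" of the mapping can be carried out exhaustively on a small formula. The following sketch assumes the same `(index, J)` clause representation as a plain Python list, which is a choice made here for illustration; all function names are hypothetical. A violated clause has $\sigma_i = J_i^a$ on its $k$ variables, giving a scalar product of $-k$, while any satisfied clause gives at least $-(k-2)$.

```python
import itertools
import random


def clause_vertex(clause, n):
    """Vertex T^a: coordinate -J_i on the variables of the clause, 0 elsewhere."""
    vertex = [0] * n
    for i, j in clause:
        vertex[i] = -j
    return vertex


def clause_satisfied(clause, sigma):
    """A clause is violated only when sigma_i == J_i on all of its variables."""
    return any(sigma[i] != j for i, j in clause)


def perceptron_condition(vertex, sigma, k):
    """Condition (19): scalar product strictly larger than -(k - 1)."""
    return sum(s * t for s, t in zip(sigma, vertex)) > -(k - 1)


def mapping_holds(formula, n, k):
    """Check clause by clause, over the whole hypercube, that both tests agree."""
    return all(
        clause_satisfied(clause, sigma)
        == perceptron_condition(clause_vertex(clause, n), sigma, k)
        for sigma in itertools.product((-1, 1), repeat=n)
        for clause in formula
    )


rng = random.Random(0)
n, m, k = 8, 20, 3
formula = [[(i, rng.choice((-1, 1))) for i in rng.sample(range(n), k)] for _ in range(m)]
print(mapping_holds(formula, n, k))
```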

The direct application of the second moment method also fails for the random $k$-SAT problem; yet a refined version
of it was used in [11], which leads to asymptotically (at large $k$) tight bounds on the location of the satisfiability
threshold.



D. From random CSP to statistical mechanics of disordered systems

The binary perceptron example taught us that the number of solutions $Z$ of a satisfiable random CSP usually scales
exponentially with the size of the problem, with large fluctuations that prevent the direct use of standard moment
methods. This led us to the introduction of the quenched entropy, as defined in (17). The computation techniques
used to obtain (18) were in fact developed in an apparently different field, the statistical mechanics of disordered
systems [1].

Let us review some basic concepts of statistical mechanics (for introductory books see for example [16, 17]). A
physical system can be modeled by a space of configurations $\underline{\sigma} \in \mathcal{X}^N$, on which is defined an energy function $E(\underline{\sigma})$. For
instance usual magnets are described by Ising spins $\sigma_i = \pm 1$, the energy being minimized when adjacent spins take
the same value. The equilibrium properties of a physical system at temperature $T$ are given by the Gibbs-Boltzmann
probability measure on $\mathcal{X}^N$,

$$ \mu(\underline{\sigma}) = \frac{1}{Z} \exp[-\beta E(\underline{\sigma})] , \tag{20} $$

where the inverse temperature $\beta$ equals $1/T$ and $Z$ is a normalization called the partition function. The energy function
$E$ has a natural scaling, linear in the number $N$ of variables (such a quantity is said to be extensive). In consequence
in the thermodynamic limit the Gibbs-Boltzmann measure concentrates on configurations with a given energy density
($e = E/N$), which depends on the conjugated parameter $\beta$. The number of such configurations is usually exponentially
large, $\approx \exp[N s]$, with $s$ called the entropy density. The partition function is thus dominated by the contribution of
these configurations, hence $\lim (\ln Z / N) = s - \beta e$.

² This situation is very different from the continuous perceptron case, where the typical overlap $q^*(\alpha)$ reaches one when $\alpha$ tends to 2: a
single solution is left right at the critical ratio.

In the above presentation we supposed the energy to be a simple, known function of the configurations. In fact
some magnetic compounds, called spin-glasses, are intrinsically disordered on a microscopic scale. This means that
there is no hope of describing exactly their microscopic details, but that one should rather assume their energy to
be itself a random function with a known distribution. Hopefully in the thermodynamic limit the fluctuations of the
thermodynamic observables such as the energy and entropy density vanish, hence the properties of a typical sample will
be closely described by the average (over the distribution of the energy function) of the entropy and energy density.

The random CSPs fit naturally in this line of research. The energy function $E(\underline{\sigma})$ of a CSP is defined as the
number of constraints violated by the assignment $\underline{\sigma}$, in other words this is the cost function to be minimized in the
associated optimization problem (MAXSAT for instance). Moreover the distribution of random instances of CSP is
the counterpart of the distribution over the microscopic description of a disordered solid. The study of the optimal
configurations of a CSP, and in particular the characterization of a satisfiability phase transition, is achieved by taking
the $\beta \to \infty$ limit. Indeed, when this parameter increases (or equivalently the temperature goes to 0), the law (20)
favors the lowest energy configurations. In particular if the formula is satisfiable $\mu$ becomes the uniform measure
over the solutions. Two important features of the formula can be deduced from the behavior of $Z$ at large $\beta$: the
ground-state energy $E_g = \min_{\underline{\sigma}} E(\underline{\sigma})$, which indicates how good the optimal configurations are, and the ground-state
entropy $S_g = \ln(|\{\underline{\sigma} : E(\underline{\sigma}) = E_g\}|)$, which counts the degeneracy of these optimal configurations. The satisfiability
of a formula is equivalent to its ground-state energy being equal to 0. In the large $N$ limit these two thermodynamic
quantities are supposed to concentrate around their mean values (this is proven for $E_g$ in [18]); we thus introduce the
associated typical densities,

$$ e_g(\alpha) = \lim_{N \to \infty} \frac{1}{N} \mathbb{E}[E_g] , \qquad s_g(\alpha) = \lim_{N \to \infty} \frac{1}{N} \mathbb{E}[S_g] . \tag{21} $$

Notice that formula (21) coincides with (17) in the satisfiable phase (where the ground-state energy vanishes).
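
For a formula small enough to enumerate, the ground-state energy and entropy can be read off directly, and compared with the large-$\beta$ behavior of the partition function, $\ln Z \approx S_g - \beta E_g$. The sketch below is a brute-force illustration under the `(index, J)` clause representation assumed here; the function names are hypothetical and the approach is only practical for a handful of variables.

```python
import itertools
from math import exp, log


def energy(formula, sigma):
    """Number of clauses violated by sigma (clause violated iff sigma_i == J_i for all i)."""
    return sum(all(sigma[i] == j for i, j in clause) for clause in formula)


def spectrum(formula, n):
    """Energies of all 2^n configurations."""
    return [energy(formula, sigma) for sigma in itertools.product((-1, 1), repeat=n)]


def ground_state(formula, n):
    """Return (E_g, S_g): minimal energy and log of the number of minimizers."""
    energies = spectrum(formula, n)
    e_g = min(energies)
    return e_g, log(energies.count(e_g))


def log_partition(formula, n, beta):
    """ln Z for the Gibbs-Boltzmann measure of Eq. (20)."""
    return log(sum(exp(-beta * e) for e in spectrum(formula, n)))


# A satisfiable 2-SAT formula on 3 variables and an unsatisfiable one on 2 variables.
sat_formula = [[(0, 1), (1, 1)], [(1, -1), (2, 1)]]
unsat_formula = [[(0, a), (1, b)] for a in (-1, 1) for b in (-1, 1)]
for formula, n in ((sat_formula, 3), (unsat_formula, 2)):
    e_g, s_g = ground_state(formula, n)
    beta = 20.0
    print(e_g, s_g, log_partition(formula, n, beta) + beta * e_g)
```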

Some criteria are needed to relate these thermodynamic quantities to the (presumed to exist) satisfiability threshold
$\alpha_s$. A first approach, used for instance in [19], consists in locating it as the point where the ground-state energy density
$e_g$ becomes positive. The assumption underlying this reasoning is the absence of an intermediate, typically UNSAT
regime, with a sub-extensive positive $E_g$. In the discussion of the binary perceptron we used another criterion, namely
we recognized $\alpha_s$ by the cancellation of the ground-state entropy density. This argument will be true if the typical
number of solutions vanishes continuously at $\alpha_s$. It is easy to realize that this is not the case for random $k$-SAT: at
any finite value of $\alpha$ a finite fraction $\exp[-\alpha k]$ of the variables do not appear in any clause, which leads to a trivial
lower bound $(\ln 2) \exp[-\alpha k]$ on $s_g$. This quantity is thus finite at the transition: a large number of solutions disappear
suddenly at $\alpha_s$. Even if it is wrong, the criterion $s_g(\alpha) = 0$ for the determination of the satisfiability transition is
instructive for two reasons. First, it becomes asymptotically correct at large $k$ (free variables are very rare in this
limit); this is why it works for the binary perceptron of Section II C (which is, as we have seen, close to $k$-SAT with
$k$ of order $N$). Second, it will reappear below in a refined version: we shall indeed decompose the entropy in two
qualitatively distinct contributions, one of the two being indeed vanishing at the satisfiability transition.
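
The fraction $\exp[-\alpha k]$ of unconstrained variables follows from the fact that a given variable is absent from one random clause with probability $1 - k/N$, hence from all $M = \alpha N$ clauses with probability $(1 - k/N)^{\alpha N} \to e^{-\alpha k}$. A minimal simulation, with a hypothetical helper name and arbitrarily chosen sizes, illustrates this:

```python
import random
from math import exp


def free_fraction(n, alpha, k, rng):
    """Fraction of the n variables that appear in none of the alpha*n random clauses."""
    seen = set()
    for _ in range(int(alpha * n)):
        seen.update(rng.sample(range(n), k))  # k distinct variables per clause
    return 1.0 - len(seen) / n


if __name__ == "__main__":
    rng = random.Random(1)
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, free_fraction(5000, alpha, 3, rng), exp(-alpha * 3))
```

Each free variable can take either value in any solution, which is the origin of the lower bound $(\ln 2) \exp[-\alpha k]$ on the entropy density.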



III. PHASE TRANSITIONS IN RANDOM CSPS 



A. The clustering phenomenon 



We have seen that the statistical physics approach to the perceptron problem naturally provided us with information
about the geometry of the space of its solutions. Maybe one of the most important contributions of physicists to the
field of random CSP was to suggest the presence of further phase transitions in the satisfiable regime $\alpha < \alpha_s$, affecting
qualitatively the geometry (structure) of the set of solutions [20-22].

This subset of the configuration space is indeed thought to break down into "clusters" in a part of the satisfiable
phase, $\alpha \in [\alpha_d, \alpha_s]$, $\alpha_d$ being the threshold value for the clustering transition. Clusters are meant as a partition of
the set of solutions having certain properties listed below. Each cluster contains an exponential number of solutions,
$\exp[N s_{\rm int}]$, and the clusters are themselves exponentially numerous, $\exp[N \Sigma]$. The total entropy density thus decomposes
into the sum of $s_{\rm int}$, the internal entropy of the clusters, and $\Sigma$, encoding the degeneracy of these clusters, usually
termed complexity in this context. Furthermore, solutions inside a given cluster should be well-connected, while two
solutions of distinct clusters are well-separated. A possible definition for these notions is the following. Suppose $\underline{\sigma}$ and
$\underline{\tau}$ are two solutions of a given cluster. Then one can construct a path $(\underline{\sigma} = \underline{\sigma}_0, \underline{\sigma}_1, \ldots, \underline{\sigma}_{n-1}, \underline{\sigma}_n = \underline{\tau})$ where any two
successive elements are separated by a sub-extensive Hamming distance. On the contrary such a path does not exist if $\underline{\sigma}$ and
$\underline{\tau}$ belong to two distinct clusters. Clustered configuration spaces as described above have often been encountered in
various contexts, e.g. neural networks [23] and mean-field spin glasses [24]. A vast body of involved, yet non-rigorous,
analytical techniques [1] has been developed in the field of statistical mechanics of disordered systems to tackle such
situations, some of them having been justified rigorously [25-27]. In this literature clusters appear under the name of
"pure states", or "lumps" (see for instance chapter 6 of [25] for a rigorous definition and proof of existence in a
related model). As we shall explain in a few lines, this clustering phenomenon has been demonstrated rigorously in
the case of random XORSAT instances [28, 29]. For random SAT instances, where in fact the detailed picture of the
satisfiable phase is thought to be richer [22], there are some rigorous results [30-32] on the existence of clusters for
large enough $k$.

B. Phase transitions in random XORSAT

Consider an instance F of the XORSAT problem [33], i.e. a list of M linear equations each involving k out of N 
boolean variables, where the additions are computed modulo 2. The study performed in [28, 29] provides a detailed 
picture of the clustering and satisfiability transition sketched above. A crucial point is the construction of a core 
subformula according to the following algorithm. Let us denote Fq = F the initial set of equations, and Vq the set of 
variables which appear in at least one equation oi Fq. A sequence Ft,Vt is constructed recursively: if there are no 
variables in Vt which appear in exactly one equation of Ft the algorithm stops. Otherwise one of these "leaf variables" 
(Tj is chosen arbitrarily, Ft+i is constructed from Ft by removing the unique equation in which (Tj appeared, and Vr+i 
is defined as the set of variables which appear at least once in Ft+i- Let us call T* the number of steps performed 
before the algorithm stops, and F' = Ft,, V = Vt, the remaining clauses and variables. Note first that despite 
the arbitrariness in the choice of the removed leaves, the output subformula F' is unambiguously determined by F. 
Indeed, F' can be defined as the maximal (in the inclusion sense) subformula in which all present variables have a 
minimal occurrence number of 2, and is thus unique. In graph theoretic terminology F' is the 2-core of F, the g-core 
of hypergraphs being a generalization of the more familiar notion on graphs, thoroughly studied in random graph 
ensembles in [34]. Extending this study, relying on the approximability of this leaf removal process by differential 
equations [35], it was shown in [28, 29] that there is a threshold phenomenon at $\alpha_d(k)$. For $\alpha < \alpha_d$ the 2-core F'
is, with high probability, empty, whereas it contains a finite fraction of the variables and equations for $\alpha > \alpha_d$. $\alpha_d$
is easily determined numerically: it is the smallest value of $\alpha$ such that the equation $x = 1 - \exp[-\alpha k x^{k-1}]$ has a
non-trivial solution in (0, 1].
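The leaf removal procedure and the numerical determination of $\alpha_d$ can be sketched as follows. This is only a minimal illustration: the function names `two_core` and `alpha_d`, the representation of an equation as a tuple of distinct variable indices (right-hand sides play no role in the construction of the core), and the plain grid minimization are assumptions of this sketch, not the procedures of [28, 29].

```python
import math

def two_core(equations):
    """Return the indices of the equations surviving leaf removal (the 2-core).

    `equations` is a list of tuples of distinct variable indices; right-hand
    sides are irrelevant for the construction of the core.
    """
    occurrences = {}                      # variable -> set of equations containing it
    for e, variables in enumerate(equations):
        for v in variables:
            occurrences.setdefault(v, set()).add(e)
    alive = set(range(len(equations)))
    leaves = [v for v, eqs in occurrences.items() if len(eqs) == 1]
    while leaves:
        v = leaves.pop()                  # arbitrary choice of a leaf variable
        if len(occurrences[v]) != 1:      # stale entry: v is no longer a leaf
            continue
        (e,) = occurrences[v]             # the unique equation containing v
        alive.discard(e)
        for w in equations[e]:
            occurrences[w].discard(e)
            if len(occurrences[w]) == 1:  # w has just become a leaf
                leaves.append(w)
    return sorted(alive)

def alpha_d(k, grid=100_000):
    """Smallest alpha for which x = 1 - exp(-alpha k x^(k-1)) has a root in (0, 1).

    For a given x the equation is solved by alpha(x) = -ln(1-x) / (k x^(k-1)),
    so the threshold is the minimum of alpha(x), located here on a plain grid.
    """
    return min(-math.log(1.0 - i / grid) / (k * (i / grid) ** (k - 1))
               for i in range(1, grid))

# Four equations in which every variable appears at least twice, plus one
# equation carrying two leaf variables (4 and 5), which is removed.
print(two_core([(0, 1, 2), (1, 2, 3), (0, 2, 3), (0, 1, 3), (3, 4, 5)]))
print(alpha_d(3))
```

Since the output of leaf removal does not depend on the order in which leaves are processed, the stack used here is only one convenient choice among many.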

It turns out that F is satisfiable if and only if F' is, and that the number of solutions of these two formulas are 
related in an enlightening way. It is clear that if the 2-core has no solution, there is no way to find one for the full 
formula. Suppose on the contrary that an assignment of the variables in V' that satisfies the equations of F' has been
found, and let us show how to construct a solution of F (and count in how many possible ways we can do this).
Set $\mathcal{N}_0 = 1$, and reintroduce step by step the removed equations, starting from the last: in the n'th step of this
new procedure we reintroduce the clause which was removed at step $T_* - n$ of the leaf removal. This reintroduced
clause has $d_n = |V_{T_*-n-1}| - |V_{T_*-n}| \ge 1$ leaves; their configuration can be chosen in $2^{d_n-1}$ ways to satisfy the
reintroduced clause, irrespectively of the previous choices, and we bookkeep this number of possible extensions by
setting $\mathcal{N}_{n+1} = \mathcal{N}_n 2^{d_n-1}$. Finally the total number of solutions of F compatible with the choice of the solution of F'
is obtained by adding the freedom of the variables which appeared in no equations of F, $\mathcal{N}_{int} = \mathcal{N}_{T_*} 2^{N-|V_0|}$. Let us
underline that $\mathcal{N}_{int}$ is independent of the initial satisfying assignment of the variables in V', as appears clearly from
the description of the reconstruction algorithm; this property can be traced back to the linear algebra structure of
the problem. This suggests naturally the decomposition of the total number of solutions of F as the product of the
number of satisfying assignments of V', call it $\mathcal{N}_{core}$, by the number of compatible full solutions $\mathcal{N}_{int}$. In terms of the
associated entropy densities this decomposition is additive

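$$ s = \Sigma + s_{int} , \qquad \Sigma \equiv \frac{1}{N} \ln \mathcal{N}_{core} , \qquad s_{int} \equiv \frac{1}{N} \ln \mathcal{N}_{int} , \qquad (22) $$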

where the quantity $\Sigma$ is the entropy density associated to the core of the formula. It is in fact much easier technically
to compute the statistical (with respect to the choice of the random formula F) properties of $\Sigma$ and $s_{int}$ once this
decomposition has been done (the fluctuations in the number of solutions are much smaller once the non-core part of
the formula has been removed). The outcome of the computations [28, 29] is the determination of the threshold value
$\alpha_s$ for the appearance of a solution of the 2-core F' (and thus of the complete formula), along with explicit formulas
for the typical values of $\Sigma$ and $s$. These two quantities are plotted on Fig. 2. The satisfiability threshold corresponds
to the cancellation of $\Sigma$: the number of solutions of the core vanishes continuously at $\alpha_s$, while the total entropy
remains finite because of the freedom of choice for the variables in the non-core part of the formula.

TABLE I: Critical connectivities for the dynamical, condensation and satisfiability transitions for random k-SAT formulas.

  k    $\alpha_d$    $\alpha_c$    $\alpha_s$
  4    9.38          9.547         9.93
  5    19.16         20.80         21.12
  6    36.53         43.08         43.4
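The bookkeeping of the reconstruction procedure can be made explicit. The short derivation below only uses the definitions given above; the symbol $M'$, denoting the number of equations of the core F', is introduced here for convenience and is not part of the original notation. Since each step of the leaf removal discards exactly one equation, $T_* = M - M'$.

```latex
\begin{align*}
\ln \mathcal{N}_{\rm int}
  &= \ln 2 \left[ (N - |V_0|) + \sum_{n=0}^{T_*-1} (d_n - 1) \right] \\
  &= \ln 2 \left[ (N - |V_0|) + (|V_0| - |V'|) - T_* \right]
   = \ln 2 \left[ (N - |V'|) - (M - M') \right] .
\end{align*}
```

The telescoping sum $\sum_n d_n = |V_0| - |V'|$ shows that the number of completions depends only on the numbers of variables and equations lying outside the core, and not on the particular core solution one started from.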

On top of the simplification in the analytical determination of the satisfiability threshold, this core decomposition 
of a formula unveils the change in the structure of the set of solutions that occurs at $\alpha_d$. Indeed, let us call cluster
all solutions of F reconstructed from a common solution of F'. Then one can show that this partition of the solution
set of F exhibits the properties exposed in Sec. III A, namely that solutions are well-connected inside a cluster and
separated from one cluster to another. The number of clusters is precisely equal to the number of solutions of the core
subformula; it thus undergoes a drastic modification at $\alpha_d$. For smaller ratios of constraints the core is typically empty and
there is one single cluster containing all solutions; when the threshold $\alpha_d$ is reached there appears an exponential
number of clusters, the rate of growth of this exponential being given by the complexity $\Sigma$. Before considering the
extension of this picture to random SAT problems, let us mention that further studies of the geometry of the space 
of solutions of random XORSAT instances can be found in [36, 37]. 

C. Phase transitions in random SAT 

The possibility of a clustering transition in random SAT problems was first studied in [20] by means of variational 
approximations. Later developments allowed the computation of the complexity and, from the condition of its 
cancellation, the estimation of the satisfiability threshold $\alpha_s$. This was first done for k = 3 in [21] and generalized for
$k \ge 4$ in [38]; some of the values of $\alpha_s$ thus computed are reported in Tab. I. A systematic expansion of $\alpha_s$ at large k
was also performed in [38]. 

SAT formulas do not share the linear algebra structure of XORSAT, which makes the analysis of the clustering 
transition much more difficult, and leads to a richer structure of the satisfiable phase $\alpha < \alpha_s$. The simple graph
theoretic arguments are not valid anymore, one cannot extract a core subformula from which the partition of the 
solutions into clusters follows directly. It is thus necessary to define them as a partition of the solutions such that each 
cluster is well-connected and well-separated from the other ones. A second complication arises; there is no reason for 
the clusters to contain all the same number of solutions, as was ensured by the linear structure of XORSAT. On the 
contrary, as was observed in [20] and in [39] for the similar random COL problem, one faces a variety of clusters with 
various internal entropies $s_{int}$. The complexity $\Sigma$ becomes a function of $s_{int}$; in other words the number of clusters
of internal entropy density $s_{int}$ is typically exponential, growing at the leading order like $\exp[N \Sigma(s_{int})]$. Drawing
the consequences of these observations, a refined picture of the satisfiable phase, and in particular the existence of a
new (so-called condensation) threshold $\alpha_c \in [\alpha_d, \alpha_s]$, was advocated in [22]. Let us briefly sketch some of these new
features and their relationship with the previous results of [21, 38]. Assuming the existence of a positive, concave,
complexity function $\Sigma(s_{int})$, continuously vanishing outside an interval of internal entropy densities $[s_-, s_+]$, the total
entropy density is given by

$$ s = \lim_{N \to \infty} \frac{1}{N} \ln \int_{s_-}^{s_+} ds_{int} \, e^{N [\Sigma(s_{int}) + s_{int}]} . \qquad (23) $$

In the thermodynamic limit the integral can be evaluated with the Laplace method. Two qualitatively distinct
situations can arise, depending on whether the integral is dominated by a critical point in the interior of the interval $[s_-, s_+]$, or by
the neighborhood of the upper limit $s_+$. In the former case an overwhelming majority of the solutions are contained
in an exponential number of clusters, while in the latter the dominant contribution comes from a sub-exponential
number of clusters of internal entropy $s_+$, as $\Sigma(s_+) = 0$. The threshold $\alpha_c$ separates the first regime $[\alpha_d, \alpha_c]$ where
the relevant clusters are exponentially numerous, from the second, condensed situation for $\alpha \in [\alpha_c, \alpha_s]$ with a
sub-exponential number of dominant clusters^1.
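The two regimes of the Laplace evaluation can be illustrated numerically. The sketch below maximizes $\Sigma(s_{int}) + s_{int}$ over the values of $s_{int}$ where the complexity is non-negative. The parabolic complexity `toy_sigma` and its parameters are purely hypothetical choices made for the illustration; they are not the complexity curves computed in [22].

```python
def dominant_clusters(sigma, s_lo, s_hi, n_grid=20001):
    """Maximise Sigma(s) + s over the grid points of [s_lo, s_hi] where Sigma(s) >= 0.

    Returns (s_star, Sigma(s_star), total entropy density), or None if the
    complexity is negative everywhere on the grid.
    """
    best = None
    for i in range(n_grid):
        s = s_lo + (s_hi - s_lo) * i / (n_grid - 1)
        c = sigma(s)
        if c >= 0.0 and (best is None or c + s > best[2]):
            best = (s, c, c + s)
    return best

def toy_sigma(height, curvature=10.0, s0=0.1):
    """Hypothetical concave complexity: a parabola of given height centred at s0."""
    return lambda s: height - curvature * (s - s0) ** 2

# Height 0.05: interior saddle point with Sigma'(s*) = -1 and Sigma(s*) > 0,
# i.e. the dominant clusters are exponentially numerous.
print(dominant_clusters(toy_sigma(0.05), 0.0, 0.3))
# Height 0.01: the maximum sits at the upper edge s_+, where Sigma vanishes
# (condensed regime).
print(dominant_clusters(toy_sigma(0.01), 0.0, 0.3))
```

Lowering the height of the toy parabola plays the role of increasing $\alpha$: the stationary point with slope $-1$ leaves the region where the complexity is positive, and the maximum gets stuck at $s_+$.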

The computations of [21, 38] did not take into account the distribution of the various internal entropies of the 
clusters, which explains the discrepancy in the estimation of the clustering threshold between [21, 38] and [22]. 
Let us however emphasize that this refinement of the picture does not contradict the estimation of the satisfiability 
threshold of [21, 38]: the complexity computed in these works is $\Sigma_{max}$, the maximal value of $\Sigma(s_{int})$ reached at a local
maximum with $\Sigma'(s) = 0$, which indeed vanishes when the whole complexity function disappears.

It is fair to say that the details of the picture proposed by statistical mechanics studies have rapidly evolved in the 
last years, and might still be improved. They rely indeed on self-consistent assumptions which are rather tedious to 
check [40] . Some elements of the clustering scenario have however been established rigorously in [30-32] , at least for 
large enough k. In particular these works demonstrated, for some values of k and $\alpha$ in the satisfiable regime, the
existence of forbidden intermediate Hamming distances between pairs of configurations, which are either close (in the 
same cluster) or far apart (in two distinct clusters). 

Note finally that the consequences of such distributions of the clusters' internal entropies were investigated on a toy
model in [41], and that yet another threshold $\alpha_f > \alpha_d$ for the appearance of frozen variables constrained to take the
same values in all solutions of a given cluster was investigated in [42].



^1 This picture is expected to hold for $k \ge 4$; for k = 3, the dominant clusters are expected to be of sub-exponential degeneracy in the
whole clustered phase, hence $\alpha_c = \alpha_d$ in this case.

D. A glimpse at the computations

The statistical mechanics of disordered systems [1] was first developed on so-called fully-connected models, where
each variable appears in a number of constraints which diverges in the thermodynamic limit. This is for instance
the case of the perceptron problem discussed in Sec. II. On the contrary, in a random k-SAT instance a variable is
typically involved in a finite number of clauses; one speaks in this case of a diluted model. This finite connectivity
is a source of major technical complications. In particular the replica method, alluded to in Sec. II C and applied to
random k-SAT in [19, 20], turns out to be rather cumbersome for diluted models in the presence of clustering [43].
The cavity formalism [21, 44, 45], formally equivalent to the replica one, is better suited to diluted models. In
the following paragraphs we shall try to give a few hints at the strategy underlying the cavity computations, that
might hopefully ease the reading of the original literature.

The description of the random formula ensemble has two complementary aspects: a global (thermodynamic) one,
which amounts to the computation of the typical energy and number of optimal configurations, and a more ambitious
one, which also provides geometrical information on the organization of this set of optimal configurations inside
the N-dimensional hypercube. As discussed above these two aspects are in fact interleaved, the clustering affecting
both the thermodynamics (by the decomposition of the entropy into the complexity and the internal entropy) and the
geometry of the configuration space. Let us for simplicity concentrate on the $\alpha < \alpha_s$ regime and consider a satisfiable
formula F. Both thermodynamic and geometric aspects can be studied in terms of the uniform probability law over
the solutions of F:

$$ \mu(\underline{\sigma}) = \frac{1}{Z} \prod_{a=1}^{M} w_a(\underline{\sigma}_{\partial a}) , \qquad (24) $$

where Z is the number of solutions of F, the product runs over its clauses, and $w_a$ is the indicator function of the
event "clause a is satisfied by the assignment $\underline{\sigma}$" (in fact this depends only on the configuration of the k variables
involved in the clause a, that we denote $\underline{\sigma}_{\partial a}$). For instance the (information theoretic) entropy of $\mu$ is equal to $\ln Z$,
the log degeneracy of solutions, and geometric properties can be studied by computing averages with respect to $\mu$ of
well-chosen functions of $\underline{\sigma}$.

A convenient representation of such a law is provided by factor graphs [46]. These are bipartite graphs with two
types of vertices (see Fig. 3 for an illustration): one variable node (filled circle) is associated to each of the Boolean
variables, while the clauses are represented by M constraint nodes (empty squares). By convention we use the indices
a, b, ... for the constraint nodes, i, j, ... for the variables. An edge is drawn between variable node i and constraint
node a if and only if a depends on i. To specify further which value of $\sigma_i$ satisfies the clause a, one can use
two types of line styles, solid and dashed on the figure. A notation repeatedly used in the following is $\partial a$ (resp. $\partial i$)
for the neighborhood of a constraint (resp. variable) node, i.e. the set of adjacent variable (resp. constraint) nodes.
In this context $\setminus$ denotes the subtraction from a set. We shall more precisely denote $\partial_+ i(a)$ (resp. $\partial_- i(a)$) the set
of clauses in $\partial i \setminus a$ agreeing (resp. disagreeing) with a on the satisfying value of $\sigma_i$, and $\partial_\sigma i$ the set of clauses in $\partial i$
which are satisfied by $\sigma_i = \sigma$. This graphical representation naturally suggests a notion of distance between variable
nodes i and j, defined as the minimal number of constraint nodes crossed on a path of the factor graph linking nodes
i and j.

Suppose now that F is drawn from the random ensemble. The corresponding random factor graph enjoys several 
interesting properties [7]. The degree $|\partial i|$ of a randomly chosen variable i is, in the thermodynamic limit, a Poisson
random variable of average $\alpha k$. If instead of a node one chooses randomly an edge (a, i), the outdegree $|\partial i \setminus a|$ of i
has again a Poisson distribution with the same parameter. Moreover, the signs of the literals being chosen uniformly,
independently of the topology of the factor graph, the degrees $|\partial_+ i(a)|$ and $|\partial_- i(a)|$ are Poisson random
variables of parameter $\alpha k/2$. Another important feature of these random factor graphs is their local tree-like character:
if the portion of the formula at graph distance smaller than L of a randomly chosen variable is exposed, the probability
that this subgraph is a tree goes to 1 if L is kept fixed while the size N goes to infinity.

Let us for a second forget about the rest of the graph and consider a finite formula whose factor graph is a tree, as 
is the case for the example of Fig. 3. The probability law $\mu$ of Eq. (24) becomes in this case a rather simple object.
Tree structures are indeed naturally amenable to a recursive (dynamic programming) treatment, operating first on
sub-trees which are then glued together. More precisely, for each edge between a variable node i and a constraint node
a one defines the amputated tree $F_{a \to i}$ (resp. $F_{i \to a}$) by removing all clauses in $\partial i$ apart from a (resp. removing only
a). These subtrees are associated to probability laws $\mu_{a \to i}$ (resp. $\mu_{i \to a}$), defined as in Eq. (24) but with a product
running only on the clauses present in $F_{a \to i}$ (resp. $F_{i \to a}$). The marginal law of the root variable i in these amputated
probability measures can be parametrized by a single real, as $\sigma_i$ can take only two values (that, in the Ising spin
convention, are ±1). We thus define these fields, or messages, $h_{i \to a}$ and $u_{a \to i}$, by

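$$ \mu_{i \to a}(\sigma_i) = \frac{1 - J_i^a \sigma_i \tanh h_{i \to a}}{2} , \qquad \mu_{a \to i}(\sigma_i) = \frac{1 - J_i^a \sigma_i \tanh u_{a \to i}}{2} , \qquad (25) $$

where we recall that $\sigma_i = J_i^a$ is the value of the literal i unsatisfying clause a. A standard reasoning (see for
instance [47]) allows one to derive recursive equations (illustrated in Fig. 4) on these messages,

$$ h_{i \to a} = \sum_{b \in \partial_+ i(a)} u_{b \to i} - \sum_{b \in \partial_- i(a)} u_{b \to i} , \qquad u_{a \to i} = -\frac{1}{2} \ln \left( 1 - \prod_{j \in \partial a \setminus i} \frac{1 - \tanh h_{j \to a}}{2} \right) . \qquad (26) $$

Because the factor graph is a tree this set of equations has a unique solution which can be efficiently determined: one
starts from the leaves (degree 1 variable nodes), which obey the boundary condition $h_{i \to a} = 0$, and progresses inwards
in the graph. The law $\mu$ can be completely described from the values of the h's and u's solutions of these equations for
all edges of the graph. For instance the marginal probability of $\sigma_i$ can be written as

$$ \mu(\sigma_i) = \frac{1 + \sigma_i \tanh h_i}{2} , \qquad h_i = \sum_{a \in \partial_+ i} u_{a \to i} - \sum_{a \in \partial_- i} u_{a \to i} . \qquad (27) $$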

In addition the entropy s of solutions of such a tree formula can be computed from the values of the messages h and
u [47].

We shall come back to the equations (26), and justify the denomination messages, in Sec. V C; these can be
interpreted as the Belief Propagation [46, 48, 49] heuristic equations for loopy factor graphs.
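The recursive computation of the messages on a tree can be sketched in a few lines and checked against an exhaustive enumeration of the solutions. The code below is only an illustration: the function names, the DIMACS-like encoding of literals by signed integers and the small example formula are assumptions of this sketch, and clauses are assumed to contain at least two variables so that all messages stay finite.

```python
import math
from itertools import product

def bp_marginals(n_vars, clauses):
    """P(sigma_i = +1) for i = 1..n_vars on a tree formula, via message passing.

    Literal v > 0 is satisfied by sigma_v = +1 and literal -v by sigma_v = -1,
    so J[a][i] below is the value of sigma_i that does NOT satisfy clause a.
    """
    J = [{abs(l): (-1 if l > 0 else +1) for l in c} for c in clauses]
    nbrs = {i: [a for a in range(len(clauses)) if i in J[a]]
            for i in range(1, n_vars + 1)}

    def h(i, a):        # variable -> clause field; equals 0 on leaves
        return sum((1.0 if J[b][i] == J[a][i] else -1.0) * u(b, i)
                   for b in nbrs[i] if b != a)

    def u(a, i):        # clause -> variable bias
        prod = 1.0
        for j in J[a]:
            if j != i:
                prod *= (1.0 - math.tanh(h(j, a))) / 2.0
        return -0.5 * math.log(1.0 - prod)

    marginals = {}
    for i in range(1, n_vars + 1):
        # a clause satisfied by sigma_i = +1 (J = -1) pushes the field upwards
        h_i = sum(-J[a][i] * u(a, i) for a in nbrs[i])
        marginals[i] = (1.0 + math.tanh(h_i)) / 2.0
    return marginals

def exact_marginals(n_vars, clauses):
    """Same quantity by exhaustive enumeration of the solutions."""
    counts, n_sol = [0] * (n_vars + 1), 0
    for sigma in product((-1, +1), repeat=n_vars):
        if all(any(sigma[abs(l) - 1] == (1 if l > 0 else -1) for l in c)
               for c in clauses):
            n_sol += 1
            for i in range(1, n_vars + 1):
                counts[i] += sigma[i - 1] == +1
    return {i: counts[i] / n_sol for i in range(1, n_vars + 1)}

tree = [[1, -2, 3], [3, 4, -5], [-1, 6]]      # its factor graph has no loop
print(bp_marginals(6, tree))
print(exact_marginals(6, tree))
```

The plain recursion terminates because each call moves away from the edge it started from; on a graph with loops it would not, and one has to iterate the equations instead.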

The factor graph of random formulas is only locally tree-like; the simple computation sketched above has thus to 
be amended in order to take into account the effect of the distant, loopy part of the formula. Let us call $F_L$ the factor
graph made of variable nodes at graph distance smaller than or equal to L from an arbitrarily chosen variable node
i in a large random formula F, and $B_L$ the variable nodes at distance exactly L from i. Without loss of generality
in the thermodynamic limit, we can assume that $F_L$ is a tree. The cavity method amounts to a hypothesis on the
effect of the distant part of the factor graph, $F \setminus F_L$, i.e. on the boundary condition it induces on $F_L$. In its simplest
(so called replica symmetric) version, that is believed to correctly describe the unclustered situation for $\alpha < \alpha_d$,
$F \setminus F_L$ is replaced, for each variable node j in the boundary $B_L$, by a fictitious constraint node which sends a bias
$u_{ext \to j}$. In other words the boundary condition is factorized on the various nodes of $B_L$; such a simple description is
expected to be correct for $\alpha < \alpha_d$ because, in the amputated factor graph $F \setminus F_L$, the distance between the variables
of $B_L$ is typically large (of order $\ln N$), and these variables should thus be weakly correlated. These external biases
are then turned into random variables to take into account the randomness in the construction of the factor graphs,
and Eq. (26) acquires a distributional meaning. The messages h (resp. u) are supposed to be i.i.d. random variables
drawn from a common distribution, the degrees $|\partial_\pm i(a)|$ being two independent Poisson random variables of parameter
$\alpha k/2$. These distributional equations can be numerically solved by a population dynamics algorithm [44], also known
as a particle representation in the statistics literature. The typical entropy density is then computed by averaging s
over these distributions of h and u.
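A population dynamics resolution of the replica symmetric distributional equations can be sketched as follows. The distribution of the fields h is represented by a finite sample, whose elements are repeatedly replaced by freshly computed fields. The function names, the population size, the number of sweeps and the initial condition are arbitrary choices of this sketch, and the computation of the entropy from the resulting populations is left out.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold, n, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        n += 1
        p *= rng.random()
    return n

def clause_to_variable(hs):
    """Bias u sent by a clause, as a function of its k-1 incoming fields h."""
    prod = 1.0
    for h in hs:
        prod *= (1.0 - math.tanh(h)) / 2.0
    prod = min(prod, 1.0 - 1e-12)          # numerical guard against log(0)
    return -0.5 * math.log(1.0 - prod)

def population_dynamics(alpha, k, size=500, sweeps=20, seed=0):
    """Return a sample of fields h approximating the fixed-point distribution."""
    rng = random.Random(seed)
    population = [0.0] * size              # initialised with the leaf value h = 0
    for _step in range(sweeps * size):
        # numbers of agreeing and disagreeing neighbouring clauses
        n_plus = poisson(alpha * k / 2.0, rng)
        n_minus = poisson(alpha * k / 2.0, rng)
        h_new = 0.0
        for sign, count in ((+1.0, n_plus), (-1.0, n_minus)):
            for _clause in range(count):
                hs = [rng.choice(population) for _ in range(k - 1)]
                h_new += sign * clause_to_variable(hs)
        population[rng.randrange(size)] = h_new
    return population

fields = population_dynamics(alpha=2.0, k=3)
print(sum(fields) / len(fields), max(abs(h) for h in fields))
```

Since the signs of the literals are unbiased, the distribution of h is symmetric and the empirical mean printed above should be close to zero, up to sampling noise.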

This description fails in the presence of clustering, which induces correlations between the variable nodes of $B_L$ in
the amputated factor graph $F \setminus F_L$. To take these correlations into account a refined version of the cavity method
(termed one step of replica symmetry breaking, in short 1RSB) has been developed. It relies on the hypothesis
that the partition of the solution space into clusters $\gamma$ has nice decorrelation properties: once decomposed onto this
partition, $\mu$ restricted to a cluster $\gamma$ behaves essentially as in the unclustered phase (it is a pure state in statistical
mechanics jargon). Each directed edge $a \to i$ should thus bear a family of messages $u_{a \to i}^{\gamma}$, one for each cluster, or
alternatively a distribution $Q_{a \to i}(u)$ of the messages with respect to the choice of $\gamma$. The equations (26) are thus
promoted to recursions between distributions $P_{i \to a}(h)$, $Q_{a \to i}(u)$, which depend on a real m known as the Parisi
breaking parameter. Its role is to select the size of the investigated clusters, i.e. the number of solutions they contain.
The computation of the typical entropy density is indeed replaced by a more detailed thermodynamic potential,

$$ \Phi(m) = \frac{1}{N} \ln \sum_{\gamma} Z_\gamma^m = \frac{1}{N} \ln \int_{s_-}^{s_+} ds_{int} \, e^{N [\Sigma(s_{int}) + m s_{int}]} . \qquad (28) $$

In this formula $Z_\gamma$ denotes the number of solutions inside a cluster $\gamma$, and we used the hypothesis that at the leading
order the number of clusters with internal entropy density $s_{int}$ is given by $\exp[N \Sigma(s_{int})]$. The complexity function
$\Sigma(s_{int})$ can thus be obtained from $\Phi(m)$ by an inverse Legendre transform. For generic values of m this approach is
computationally very demanding; following the same steps as in the replica symmetric version of the cavity method
one faces a distribution (with respect to the topology of the factor graph) of distributions (with respect to the choice
of the clusters) of messages. Simplifications however arise for m = 1 and m = 0 [22]; the latter case corresponds in
fact to the original Survey Propagation approach of [21]. As appears clearly in Eq. (28), for this value of m all clusters
are treated on an equal footing and the dominant contribution comes from the most numerous clusters, independently
of their sizes. Moreover, as we further explain in Sec. V C, the structure of the equations can be greatly simplified in
this case, the distribution over the cluster of fields being parametrized by a single number.
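It may help to spell out the inverse Legendre transform mentioned above. The relations below follow from a saddle-point evaluation of Eq. (28), under the assumptions (made here for the sake of the sketch) that $\Sigma$ is smooth and strictly concave, and that the maximizer $s_{int}^*(m)$ lies in the interior of $[s_-, s_+]$.

```latex
\begin{align*}
\Phi(m) &= \max_{s_{\rm int} \in [s_-, s_+]} \left[ \Sigma(s_{\rm int}) + m \, s_{\rm int} \right]
         = \Sigma(s_{\rm int}^*) + m \, s_{\rm int}^* ,
         \qquad \Sigma'(s_{\rm int}^*) = -m , \\
s_{\rm int}^*(m) &= \Phi'(m) , \qquad
\Sigma(s_{\rm int}^*(m)) = \Phi(m) - m \, \Phi'(m) .
\end{align*}
```

Varying m thus traces the curve $\Sigma(s_{int})$ parametrically: m = 1 weights each cluster by its number of solutions, as in the total entropy, while m = 0 selects the maximum of the complexity, where $\Sigma' = 0$.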



E. Finite Size Scaling results 

As we explained in Sec. II B the threshold phenomenon can be more precisely described by finite size scaling
relations. Let us mention some FSS results about the transitions we just discussed.

FIG. 4: A schematic representation of Eq. (26).



For random 2-SAT, where the satisfiability property is known [50] to exhibit a sharp threshold at $\alpha_s = 1$, the width
of the transition window has been determined in [51]. The range of $\alpha$ where the probability of satisfaction drops
significantly is of order $N^{-1/3}$, i.e. the exponent $\nu$ is equal to 3, as for the random graph percolation. This similarity
is not surprising: the proof of [51] indeed relies on a mapping of 2-SAT formulas onto random (directed) graphs.

The clustering transition for XORSAT was first conjectured in [52] (in the related context of error-correcting codes)
then proved in [53] to be described by

$$ P(N, M = N(\alpha_d + N^{-1/2} \lambda + N^{-2/3} \delta)) = \mathcal{F}(\lambda) + O(N^{-5/26}) , \qquad (29) $$

where $\delta$ is a subleading shift correction that has been explicitly computed, and the scaling function $\mathcal{F}$ is, up to a
multiplicative factor on $\lambda$, the same error function as in Eq. (4).

A general result has been proved in [54] on the width of transition windows. Under rather unrestrictive conditions 
one can show that $\nu \ge 2$: the transitions cannot be arbitrarily sharp. Roughly speaking the bound is valid when a
finite fraction of the clauses are not decisive for the property of the formulas studied, for instance clauses containing
a leaf variable are not relevant for the satisfiability of a formula. The number of these irrelevant clauses is of order N
and has thus natural fluctuations of order $\sqrt{N}$; these fluctuations blur the transition window which cannot be sharper
than $N^{-1/2}$.

Several studies (see for instance [33, 55, 56]) have attempted to determine the transition window from numeric 
evaluations of the probability $P(N, \alpha)$, for instance for the satisfiability threshold of random 3-SAT [55, 56] and
XORSAT [33]. These studies are necessarily confined to small formula sizes, as the typical computation cost of
complete algorithms grows exponentially around the transition. In consequence the asymptotic regime of the transition
window, $N^{-1/\nu}$, is often hidden by subleading corrections which are difficult to evaluate, and in [55, 56] the reported
values of $\nu$ were found to be in contradiction with the rigorous bound derived later. This is not an isolated case,
numerical studies are often plagued by uncontrolled finite-size effects, as for instance in the bootstrap percolation [57], 
a variation of the classical percolation problem. 

IV. LOCAL SEARCH ALGORITHMS 

The remainder of this review will be devoted to the study of various solving algorithms for SAT formulas. Algorithms
are, to some extent, similar to dynamical processes studied in statistical physics. In this context the focus is however
mainly on stochastic processes that respect detailed balance with respect to the Gibbs-Boltzmann measure [58], a
condition which is rarely respected by solving algorithms. Physics inspired techniques can yet be useful, and will
emerge in three different ways. The random walk algorithms considered in this Section are stochastic processes in
the space of configurations (not fulfilling the detailed balance condition), moving by small steps where one or a few
variables are modified. Out-of-equilibrium physics (and in particular growth processes) provides an interesting view on
classical complete algorithms (DPLL), as shown in Sec. V B. Finally, the picture of the satisfiable phase put forward
in Sec. III underlies the message-passing procedures discussed in Sec. V C.

A. Pure random walk sat, definition and results valid for all instances 

Papadimitriou [59] proposed the following algorithm, called Pure Random Walk Sat (PRWSAT) in the following,
to solve k-SAT formulas:

1. Choose an initial assignment $\underline{\sigma}(0)$ uniformly at random and set T = 0.

2. If $\underline{\sigma}(T)$ is a solution of the formula (i.e. $E(\underline{\sigma}(T)) = 0$), output SOLUTION and stop. If $T = T_{max}$, a threshold
fixed beforehand, output UNDETERMINED and stop.

3. Otherwise, pick uniformly at random a clause among those that are UNSAT in $\underline{\sigma}(T)$; pick uniformly at random
one of the k variables of this clause and flip it (reverse its status from True to False and vice-versa) to define
the next assignment $\underline{\sigma}(T+1)$; set $T \to T + 1$ and go back to step 2.

This defines a stochastic process $\underline{\sigma}(T)$, a biased random walk in the space of configurations. The modification
$\underline{\sigma}(T) \to \underline{\sigma}(T+1)$ in step 3 makes the selected clause satisfied; however the flip of a variable i can turn previously
satisfied clauses into unsatisfied ones (those which were satisfied solely by i in $\underline{\sigma}(T)$).
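The three steps of the algorithm translate almost literally into code. The sketch below favors readability over speed (the list of unsatisfied clauses is recomputed from scratch at every step); the function names and the encoding of literals by signed integers are choices made for this illustration.

```python
import random

def is_satisfied(clause, sigma):
    # literal v > 0 is satisfied when variable v is True, literal -v when it is False
    return any(sigma[abs(lit)] == (lit > 0) for lit in clause)

def prwsat(n_vars, clauses, t_max, rng=None):
    """Pure Random Walk SAT; returns (assignment or None, number of flips)."""
    rng = rng or random.Random()
    # step 1: uniformly random initial assignment (index 0 is unused)
    sigma = [None] + [rng.random() < 0.5 for _ in range(n_vars)]
    t = 0
    while True:
        unsat = [c for c in clauses if not is_satisfied(c, sigma)]
        if not unsat:                      # step 2: E = 0, SOLUTION
            return sigma, t
        if t == t_max:                     # step 2: budget exhausted, UNDETERMINED
            return None, t
        clause = rng.choice(unsat)         # step 3: random UNSAT clause ...
        var = abs(rng.choice(clause))      # ... and random variable of that clause
        sigma[var] = not sigma[var]
        t += 1

formula = [[1, 2, 3], [-1, 2, 4], [-2, -3, 4], [1, -4, 3]]
assignment, flips = prwsat(4, formula, t_max=10_000, rng=random.Random(1))
print(assignment is not None, flips)
```

A returned assignment is a certificate of satisfiability, whereas a `None` output only means that the walk did not reach a solution within the allotted number of flips.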

This algorithm is not complete: if it outputs a solution one is certain that the formula was satisfiable (and the 
current configuration provides a certificate of it), but if no solution has been found within the Tmax allowed steps 
one cannot be sure that the formula was unsatisfiable. There are however two rigorous results which make it a
probabilistically almost complete algorithm [60]. 

For k = 2, it was shown in [59] that PRWSAT finds a solution in a time of order $O(N^2)$ with high probability for
all satisfiable instances. Hence, one is almost certain that the formula was unsatisfiable if the output of the algorithm
is UNDETERMINED after $T_{max} = O(N^2)$ steps.

Schöning [61] proposed the following variation for k = 3. If the algorithm fails to find a solution before $T_{max} = 3N$
steps, instead of stopping and printing UNDETERMINED, it restarts from step 1, with a new random initial condition
$\underline{\sigma}(0)$. Schöning proved that if after R restarts no solution has been found, then the probability that the instance is
satisfiable is upper-bounded by $\exp[-R \times (3/4)^N]$ (asymptotically in N). This means that a computational cost of
order $(4/3)^N$ allows one to reduce the probability of error of the algorithm to arbitrarily small values. Note that although the time
scaling of this bound is exponential, it is also exponentially smaller than the $2^N$ cost of an exhaustive enumeration.
Improvements on the factor 4/3 are reported in [62].
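To get a feeling for the orders of magnitude, one can take the bound quoted above at face value (ignoring the fact that it holds asymptotically in N) and compute the number of restarts needed to reach a prescribed error probability. The function name and the chosen values of N and of the error are arbitrary.

```python
import math

def restarts_for_error(n_vars, error):
    """Smallest R with exp(-R (3/4)^N) <= error, using the quoted bound as is."""
    return math.ceil(math.log(1.0 / error) * (4.0 / 3.0) ** n_vars)

for n in (20, 50, 100):
    r = restarts_for_error(n, 1e-6)
    # total number of flips (3N per restart) relative to the 2^N configurations
    print(n, r, f"{r * 3 * n / 2.0 ** n:.3g}")
```

The last column decreases exponentially with N, which is the sense in which the restart strategy improves on exhaustive enumeration.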

B. Typical behavior on random k-SAT instances

The results quoted above are true for any k-SAT instance. An interesting phenomenology arises when one applies
the PRWSAT algorithm to instances drawn from the random k-SAT ensemble [63, 64]. Figure 5 displays the temporal
evolution of the number of unsatisfied clauses during the execution of the algorithm, for two random 3-SAT instances
of constraint ratio $\alpha = 2$ and 3. The two curves are very different: at low values of $\alpha$ the energy decays rather
fast towards 0, until a point where the algorithm finds a solution and stops. On the other hand, for larger values of
$\alpha$, the energy first decays towards a strictly positive value, around which it fluctuates for a long time, until a large
fluctuation reaches 0, signaling the discovery of a solution. A more detailed study with formulas of increasing sizes
reveals that a threshold value $\alpha_{rw} \approx 2.7$ (for k = 3) sharply separates these two dynamical regimes. In fact the fraction
of unsatisfied clauses $\varphi = E/M$, expressed in terms of the reduced time $t = T/M$, concentrates in the thermodynamic
limit around a deterministic function $\varphi(t)$. For $\alpha < \alpha_{rw}$ the function $\varphi(t)$ reaches 0 at a finite value $t_{sol}(\alpha, k)$, which
means that the algorithm finds a solution in a linear number of steps, typically close to $N t_{sol}(\alpha, k)$. On the contrary
for $\alpha > \alpha_{rw}$ the reduced energy $\varphi(t)$ reaches a positive value $\varphi_{as}(\alpha, k)$ as $t \to \infty$; a solution, if any, can be found only
through large fluctuations of the energy which occur on a time scale exponentially large in N. This is an example of
a metastability phenomenon, found in several other stochastic processes, for instance the contact process [65]. When
the threshold $\alpha_{rw}$ is reached from below the solving time $t_{sol}(\alpha, k)$ diverges, while the height of the plateau $\varphi_{as}(\alpha, k)$
vanishes when $\alpha_{rw}$ is approached from above.
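The kind of numerical experiment described above can be reproduced on a small scale with the following sketch, which draws a random k-SAT formula and records the fraction of unsatisfied clauses along a PRWSAT run. The function names, the formula size and the time budget are arbitrary choices made to keep the run short; they are much smaller than what is needed to observe the sharp threshold.

```python
import random

def random_ksat(n_vars, alpha, k, rng):
    """M = alpha N clauses, each on k distinct variables with random signs."""
    clauses = []
    for _ in range(int(alpha * n_vars)):
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

def energy_trace(n_vars, clauses, t_max, rng):
    """Run PRWSAT and record the fraction of unsatisfied clauses at each step."""
    sigma = [None] + [rng.random() < 0.5 for _ in range(n_vars)]
    trace = []
    for _ in range(t_max + 1):
        unsat = [c for c in clauses
                 if not any(sigma[abs(l)] == (l > 0) for l in c)]
        trace.append(len(unsat) / len(clauses))
        if not unsat:                       # solution found, stop the walk
            break
        var = abs(rng.choice(rng.choice(unsat)))
        sigma[var] = not sigma[var]
    return trace

rng = random.Random(0)
n = 100
for alpha in (2.0, 3.0):
    formula = random_ksat(n, alpha, 3, rng)
    phi = energy_trace(n, formula, 10 * len(formula), rng)
    print(alpha, len(phi) - 1, phi[-1])     # flips performed, final fraction
```

Plotting `phi` against the reduced time (index divided by the number of clauses) gives curves of the type shown in Fig. 5, although finite-size fluctuations are strong for such small formulas.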

In [63, 64] various statistical mechanics inspired techniques have been applied to study analytically this phenomenology;
some results are presented in Figure 6. The low $\alpha$ regime can be tackled by a systematic expansion of $t_{sol}(\alpha, k)$
in powers of $\alpha$. The first three terms of these series have been computed, and are shown on the left panel to be in
good agreement with the numerical simulations.

Another approach was followed to characterize the transition at $\alpha_{rw}$, and to compute (approximations of) the asymptotic
fraction of unsatisfied clauses $\varphi_{as}$ and the intensity of the fluctuations around it. The idea is to project the
Markovian evolution of the configuration $\underline{\sigma}(T)$ on a simpler observable, the energy E(T). Obviously the Markovian
property is lost in this transformation, and the dynamics of E(T) is much more complex. One can however
approximate it by assuming that all configurations of the same energy E(T) are equiprobable at a given step of
execution of the algorithm. This rough approximation of the evolution of E(T) is found to concentrate around its
mean value in the thermodynamic limit, as was observed numerically for the original process. Standard techniques
allow one to compute this average approximated evolution, which exhibits the threshold behavior explained above at a
value $\alpha = (2^k - 1)/k$ which is, for k = 3, slightly lower than the threshold $\alpha_{rw}$. The right panel of Fig. 6 compares the
results of this approximation with the numerical simulations; given the roughness of the hypothesis the agreement is
rather satisfying, and is expected to improve for larger values of k.

FIG. 5: Fraction of unsatisfied constraints $\varphi = E/M$ as a function of reduced time $t = T/M$ during the execution of PRWSAT
on random 3-SAT formulas with N = 500 variables. Top: $\alpha = 2$, Bottom: $\alpha = 3$.

The rigorous results on the behavior of PRWSAT on random instances are very few. Let us mention in particular [66],
which proved that the solving time for random 3-SAT formulas is typically polynomial up to $\alpha = 1.63$, a result in
agreement with, yet weaker than, the numerical results presented here.

C. More performant variants of the algorithm 

The threshold $\alpha_{rw}$ for linear time solving of random instances by PRWSAT was found above to be much smaller
than the satisfiability threshold $\alpha_s$. It must however be emphasized that PRWSAT is only the simplest example of
a large family of local search algorithms, see for instance [67-71]. They all share the same structure: a solution is 
searched through a random walk in the space of configurations, one variable being modified at each step. The choice 
of the flipped variable is made according to various heuristics; the goal is to find a compromise between the greediness 
of the walk which seeks to minimize locally the energy of the current assignment, and the necessity to allow for moves 
increasing the energy in order to avoid the trapping in local minima of the energy function. A frequently encountered 
ingredient of the heuristics, which is of a greedy nature, is the focusing: the flipped variable necessarily belongs to at 
least one unsatisfied clause before the flip, which thus becomes satisfied after the move. Moreover, instead of choosing 
randomly one of the k variables of the unsatisfied clause, one can consider for each of them the effect of the flip, and 
avoid variables which, once flipped, will turn satisfied clauses into unsatisfied ones [67, 68]. Another way to implement 
the greediness [69] consists in bookkeeping the lowest energy found so far during the walk, and forbids flips which will 
raise the energy of the current assignment above the registered record plus a tolerance threshold. These demanding 
requirements have to be balanced with noisy, random steps, allowing to escape traps which are only locally minima 
of the objective function. 
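
The ingredients listed above (focusing, greedy choice inside the selected clause, noisy steps) can be combined in a minimal sketch such as the following one. It is not the implementation of any of the solvers of [67-71]: the function names, the `noise` parameter and the encoding of clauses as tuples of signed integers (literal v means "variable v is true", -v means "variable v is false") are assumptions made for illustration only.

```python
import random

def is_sat(clause, assign):
    # A literal l > 0 is satisfied when variable l is True, l < 0 when variable -l is False.
    return any(assign[abs(l)] == (l > 0) for l in clause)

def break_count(var, clauses, assign):
    # Number of clauses that are satisfied now and would become unsatisfied if `var` were flipped.
    touched = [c for c in clauses if any(abs(l) == var for l in c)]
    before = [is_sat(c, assign) for c in touched]
    assign[var] = not assign[var]          # tentative flip
    after = [is_sat(c, assign) for c in touched]
    assign[var] = not assign[var]          # undo the flip
    return sum(b and not a for b, a in zip(before, after))

def focused_search(clauses, n_vars, max_flips=10000, noise=0.5, seed=0):
    """Focused random walk; returns a satisfying assignment or None if max_flips is exhausted."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for _ in range(max_flips):
        unsat = [c for c in clauses if not is_sat(c, assign)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)         # focusing: only variables of an unsatisfied clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # noisy step: any variable of the clause
        else:                              # greedy step: break as few satisfied clauses as possible
            var = min((abs(l) for l in clause),
                      key=lambda v: break_count(v, clauses, assign))
        assign[var] = not assign[var]
    return None
```

Setting `noise=1` gives back a pure focused random walk in the spirit of PRWSAT, while `noise=0` gives a purely greedy focused walk that may remain trapped.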

These more elaborate heuristics are very numerous, and depend on parameters that are finely tuned to achieve 
the best performance; an exhaustive comparison is therefore beyond the scope of this review. Let us only mention that 
some of these heuristics are reported in [69, 70] to efficiently find solutions of large (up to N = 10^) random 3-SAT formulas 
at ratios α very close to the satisfiability threshold, i.e. for α < 4.21. 



FIG. 6: Top: linear solving time t_sol(α, 3) for random 3-SAT formulas as a function of α; symbols correspond to numerical 
simulations, solid line to the second-order expansion in α obtained in [63]. Bottom: fraction of unsatisfied constraints reached 
at large time for α > α_rw for random 3-SAT formulas; symbols correspond to numerical simulations, solid line to the approximate 
analytical computations of [63, 64]. 



V. DECIMATION BASED ALGORITHMS 

The algorithms studied in the remainder of the review are of a very different nature compared to the local search 
procedures described above. Given an initial formula F whose satisfiability has to be decided, they proceed by sequentially 
assigning the value of some of the variables. The formula can be simplified under such a partial assignment: clauses 
which are satisfied by at least one of their literals can be removed, while literals falsified by the assignment are discarded 
from the clauses in which they appear. It is instructive to consider the following thought experiment: suppose one can consult an oracle who, 
given a formula, is able to compute the marginal probability of each variable in the uniform probability measure over 
the optimal assignments of the formula. With the help of such an oracle it would be possible to sample uniformly the 
optimal assignments of F, by computing these marginals, setting one unassigned variable according to its marginal, 
and then proceeding in the same way with the simplified formula. A slightly less ambitious, yet still unrealistic, task is 
to find one optimal configuration (not necessarily uniformly distributed) of F; this can be performed if the oracle is 
able to reveal, for each formula it is questioned about, which of the unassigned variables take the same value in all 
optimal assignments, and what this value is. It is then enough to avoid setting such a constrained variable incorrectly 
to obtain an optimal assignment at the end. 
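
On very small satisfiable formulas the thought experiment can be mimicked by replacing the oracle with an exhaustive enumeration, whose cost is of course exponential in the number of variables. The sketch below is only meant to make the simplification rule and the sequential sampling explicit; the function names and the encoding of clauses as tuples of signed integers are illustrative assumptions, and the "optimal assignments" are taken to be the solutions, i.e. the formula is assumed to be satisfiable.

```python
import itertools
import random

def simplify(clauses, var, value):
    """Set var := value: drop satisfied clauses, discard falsified literals from the others."""
    out = []
    for c in clauses:
        if any(abs(l) == var and (l > 0) == value for l in c):
            continue                                  # clause satisfied: removed
        out.append(tuple(l for l in c if abs(l) != var))
    return out

def count_solutions(clauses, variables):
    """Brute-force stand-in for the oracle: number of solutions over `variables`."""
    variables = sorted(variables)
    total = 0
    for values in itertools.product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            total += 1
    return total

def oracle_decimation(clauses, n_vars, seed=0):
    """Sample a solution uniformly by fixing variables one by one according to their marginals."""
    rng = random.Random(seed)
    free = set(range(1, n_vars + 1))
    solution = {}
    while free:
        var = min(free)
        free.remove(var)
        n_true = count_solutions(simplify(clauses, var, True), free)
        n_false = count_solutions(simplify(clauses, var, False), free)
        if n_true + n_false == 0:
            return None                               # the formula has no solution
        value = rng.random() < n_true / (n_true + n_false)   # exact marginal of var
        solution[var] = value
        clauses = simplify(clauses, var, value)
    return solution
```

Because each variable is drawn from its exact conditional marginal, the product of the successive probabilities is the same for every solution, which is why the output is uniformly distributed over the solutions.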

Of course such procedures are not meant as practical algorithms; instead of these fictitious oracles one has to 
resort to simplified evidence gathered from the current formula to guide the choice of the variable to assign. In 
Sec. V A we consider algorithms exploiting basic information on the number of occurrences of each variable, and their 
behavior in the satisfiable regime of random SAT formulas. They are turned into complete algorithms by allowing for 
backtracking on the heuristic choices, as explained in Sec. V B. Finally, in Sec. V C we shall use more refined message-passing 
sub-procedures to provide the information used in the assignment steps. 

A. Heuristic search: the success-to-failure transition 



The first algorithm we consider was introduced and analyzed by Franco and his collaborators [72, 73]. 

1. If the formula contains a unit clause, i.e. a clause with a single variable, this clause is satisfied through an 
appropriate assignment of its unique variable (propagation). If the formula contains no unit clause, a variable 
and its truth value are chosen according to some heuristic rule (free choice). Note that unit clause propagation 
corresponds to the obvious answer an oracle would provide on such a formula. 

2. Then the clauses in which the assigned variable appears are simplified: satisfied clauses are removed, the other 
ones are reduced. 

3. Resume from step 1. 

The procedure ends when one of two conditions is met: 

1. The formula is completely empty (all clauses have been removed), and a solution has been found (SUCCESS). 

2. A contradiction is generated by the presence of two opposite unit clauses. The algorithm halts. We do not 
know whether a solution exists and has not been found, or whether there is no solution (FAILURE). 

The simplest example of heuristic is called Unit Clause (UC) and consists in choosing a variable uniformly at 
random among those that are not yet set, and assigning it to true or false uniformly at random. More sophisticated 
heuristics can take into account the number of occurrences of each variable and of its negation, the length of the clauses 
in which each variable appears, or they can set more than one variable at a time. For example, in the Generalized 
Unit Clause (GUC) heuristic, the variable is always chosen among those appearing in the shortest clauses. 
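
The procedure described above can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the implementation analyzed in [72, 73]: clauses are tuples of signed integers without repeated variables, the function names are hypothetical, and for GUC we assume that a literal drawn at random from a randomly chosen shortest clause is set so as to satisfy that clause (the text above only specifies where the variable is chosen).

```python
import random

def assign(clauses, lit):
    """Make literal `lit` true: remove satisfied clauses, shorten the clauses containing -lit."""
    return [tuple(l for l in c if l != -lit) for c in clauses if lit not in c]

def heuristic_search(clauses, n_vars, rule="UC", seed=0):
    """Return a satisfying assignment (SUCCESS) or None (FAILURE); no backtracking is done."""
    rng = random.Random(seed)
    free = set(range(1, n_vars + 1))
    solution = {}
    while clauses:
        if () in clauses:
            # An empty clause appears when two opposite unit clauses were present.
            return None
        units = [c[0] for c in clauses if len(c) == 1]
        if units:
            lit = units[0]                               # propagation
        elif rule == "GUC":
            shortest = min(len(c) for c in clauses)      # free choice in a shortest clause
            lit = rng.choice(rng.choice([c for c in clauses if len(c) == shortest]))
        else:
            var = rng.choice(sorted(free))               # UC: random variable, random value
            lit = var if rng.random() < 0.5 else -var
        solution[abs(lit)] = lit > 0
        free.discard(abs(lit))
        clauses = assign(clauses, lit)
    for var in free:                                     # remaining variables are unconstrained
        solution[var] = rng.random() < 0.5
    return solution
```

A return value of `None` only signals that this particular run met a contradiction; as stressed above, it says nothing about the satisfiability of the formula.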

Numerical experiments and theory show that the results of this procedure applied to random k-SAT formulas with 
ratio α and size N can be classified in two regimes: 

• At low ratio α < α_H the search procedure finds a solution with positive probability (over the formulas and the 
random choices of the algorithm) when N → ∞. 

• At high ratio α > α_H the probability of finding a solution vanishes when N → ∞. Notice that α_H < α_s: 
solutions do exist in the range [α_H, α_s] but are not found by this heuristic. 

The above algorithm modifies the formula as it proceeds; during the execution of the algorithm the current formula 
will contain clauses of length 2 and 3 (we specialize here to k = 3 for the sake of simplicity, but higher values 
of k can be considered). The sub-formulas generated by the search procedure maintain their statistical uniformity 
(conditioned on the number of clauses of length 2 and 3). Franco and collaborators used this fact to write down 
differential equations for the evolution of the densities of 2- and 3-clauses as a function of the fraction t of eliminated 
variables. We do not reproduce those equations here, see [74] for a pedagogical review. Based on this analysis Frieze 
and Suen [75] were able to calculate, in the limit of infinite size, the probability of successful search. The outcome for 
the UC heuristic is 



$$ P^{(UC)}_{\rm success}(\alpha) = \exp\{ \dots \} \qquad (30) $$

when α < 8/3, and P^(UC)_success = 0 for larger ratios. The probability P^(UC)_success is, as expected, a decreasing function of α; it 
vanishes at α_H = 8/3. A similar calculation shows that α_H ≃ 3.003 for the GUC heuristic [75]. 

FIG. 7: Trajectories generated by heuristic search acting on 3-SAT for α = 2 and α = 3.5. For all heuristics, the starting point 
is on the p = 1 axis, with the initial value of α as ordinate. The curves that end at the origin correspond to UC, those ending 
on the p = 1 axis correspond to GUC. The thick line represents the satisfiability threshold: the part on the left of the critical 
point (2/5, 5/3) is exact and coincides with the contradiction line, where contradictions are generated with high probability, of 
equation α = 1/(1 − p), and which is plotted for larger values of p as well; the part on the right of the critical point is only a 
sketch. When the trajectories hit the satisfiability threshold, at points G for UC and G' for GUC, they enter a region in which 
massive backtracking takes place, and the trajectory represents the evolution prior to backtracking. The dashed part of the 
curves is "unphysical", i.e. the trajectories stop when the contradiction curve is reached. 

Franco et al.'s analysis can be recast in the following terms. Under the operation of the algorithm the original 3-SAT 
formula is turned into a mixed 2+p-SAT formula, where p denotes the fraction of clauses with 3 variables: there 
are Nα(1 − p) 2-clauses and Nαp 3-clauses. As we mentioned earlier, the simplicity of the heuristics maintains a 
statistical uniformity over the formulas with given values of α and p. This observation motivated the study of the 
random 2+p-SAT ensemble by statistical mechanics methods [20, 56], some of the results being later confirmed by the 
rigorous analysis of [76]. At the heuristic level one expects the existence of a p-dependent satisfiability threshold α_s(p), 
interpolating between the known 2-SAT threshold, α_s(p = 0) = 1, and the conjectured 3-SAT one, α_s(p = 1) ≃ 4.267. 
The upper bound α_s(p) ≤ 1/(1 − p) is easily obtained: for the mixed formula to be satisfiable, the sub-formula 
obtained by retaining only the clauses of length 2 must necessarily be satisfiable as well. In fact this bound is tight 
for all values of p ∈ [0, 2/5]. During the execution of the algorithm the ratio α and the fraction p are 'dynamical' 
parameters, changing with the fraction t = T/N of variables assigned by the algorithm. They define the coordinates 
of the representative point of the instance at 'time' t in the (p, α) plane of Figure 7. The motion of the representative 
point defines the search trajectory of the algorithm. Trajectories start from the point of coordinates p(0) = 1, α(0) = α 
and end up on the α = 0 axis when a solution is found. The probability of success is positive as long as the 2-SAT 
subformula is satisfiable, that is, as long as α(1 − p) < 1. In other words success is possible provided the trajectory 
does not cross the contradiction line α = 1/(1 − p) (Figure 7). The largest initial ratio α such that no crossing occurs 
defines α_H. Notice that the search trajectory is a stochastic object. However Franco has shown that the deviations 
from its average locus in the plane vanish in the N → ∞ limit (concentration phenomenon). Large deviations from 
the typical behavior can be calculated, e.g. to estimate the probability of success above α_H [77]. 
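
The geometric definition of α_H can be illustrated numerically for UC. The sketch below relies on an assumption that is not derived in the text above: the densities of 3- and 2-clauses per initial variable along the average UC trajectory are taken to be c_3(t) = α_0 (1 − t)^3 and c_2(t) = (3 α_0 / 2) t (1 − t)^2, the closed-form solution usually quoted for the rate equations of [72-74]. The function names are hypothetical.

```python
def uc_point(alpha0, t):
    """Representative point (p, alpha) at time t < 1 for UC started at ratio alpha0.

    Assumption: c3 and c2 follow the closed forms quoted in the lead-in paragraph.
    """
    c3 = alpha0 * (1.0 - t) ** 3               # 3-clauses per initial variable
    c2 = 1.5 * alpha0 * t * (1.0 - t) ** 2     # 2-clauses per initial variable
    return c3 / (c2 + c3), (c2 + c3) / (1.0 - t)

def crosses_contradiction_line(alpha0, steps=2000):
    """True if the trajectory reaches alpha * (1 - p) >= 1 at some time of the grid."""
    for n in range(steps):
        p, alpha = uc_point(alpha0, n / steps)
        if alpha * (1.0 - p) >= 1.0:
            return True
    return False

def alpha_H(lo=1.0, hi=4.0, tol=1e-6):
    """Bisection on the initial ratio: largest alpha0 whose trajectory avoids the line."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if crosses_contradiction_line(mid):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Under this assumption α(1 − p) = (3 α_0 / 2) t (1 − t) along the trajectory, whose maximum 3 α_0 / 8 is reached at t = 1/2, so that the bisection returns a value close to 8/3.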

The precise form of P_success and the value α_H of the ratio where it vanishes are specific to the heuristic considered 
(UC in (30)). However the behavior of the probability close to α_H is largely independent of the heuristic (provided 
it preserves the uniformity of the subformulas generated): 

$$ \ln P_{\rm success}\big(\alpha = \alpha_H (1 - \lambda)\big) \sim -\lambda^{-1/2} . \qquad (31) $$

This universality can loosely be interpreted by observing that for α close to α_H the trajectory will pass very close 
to the contradiction curve α(1 − p) = 1, which characterizes the locus of the points where the probability that a 
variable is assigned by the heuristic H vanishes (and all the variables are assigned by Unit Propagation). The value 
of α_H depends on the "shape" of the trajectory far from this curve, and will therefore depend on the heuristic, but 
the probability of success (i.e. of avoiding the contradiction curve) for values of α close to α_H will only depend on the 
local behavior of the trajectory close to the contradiction curve, a region where most variables are assigned through 
Unit Propagation and which is not sensitive to the heuristic. 

The finite-size corrections to equation (30) are also universal (i.e. independent of the heuristic): 

$$ \ln P_{\rm success}\big(\alpha = \alpha_H (1 - \lambda), N\big) \sim -N^{1/6} \, \Phi(\lambda N^{1/3}) , \qquad (32) $$

where Φ is a universal scaling function which can be exactly expressed in terms of the Airy function [78]. This result 
indicates that right at α_H the probability of success decreases as a stretched exponential ∼ exp(−cst N^{1/6}). 

The exponent 1/3 suggests that the critical scaling of P is related to random graphs. After T = tN steps of the 
procedure, the sub-formula will consist of C_3, C_2 and C_1 clauses of length 3, 2 and 1 respectively (notice that these 
are extensive, i.e. O(N) quantities). We can represent the clauses of length 1 and 2 (which are the relevant ones to 
understand the generation of contradictions) as an oriented graph G in the following way. We will have a vertex for each 
literal, and represent 1-clauses by "marking" the literal appearing in each; a 2-clause will be represented by two directed 
edges, corresponding to the two implications equivalent to the clause (for example, x̄_1 ∨ x_2 is represented by the directed 
edges x_1 → x_2 and x̄_2 → x̄_1). The average out-degree of the vertices in the graph is γ = C_2/(N − T) = α(t)(1 − p(t)). 

What is the effect of the algorithm on G? The algorithm proceeds in "rounds": a variable is set by the heuristic, 
and a series of Unit Propagations is performed until no more unit clauses are left, at which point a new round starts. 
Notice that during a round, extensive quantities such as C_1, C_2, C_3 are likely to vary by bounded amounts, and γ to vary 
by O(1/N) (this is the very reason that guarantees that these quantities are concentrated around their mean). At 
each step of Unit Propagation, a marked literal (say x) is assigned and removed from G together with all the edges 
connected to it, and the "descendants" of x (i.e. the literals at the end of outgoing edges) are marked. The opposite literal x̄ is 
also removed together with its edges, but its descendants are not marked. Therefore, the marked vertices "diffuse" in a 
connected component of G following directed edges. Moreover, at each step new edges, corresponding to clauses of 
length 3 that get simplified into clauses of length 2, are added to the graph. 

When γ > 1, G undergoes a directed percolation transition, and a giant component of size O(N) appears, in which 
it is possible to go from any vertex to any other vertex by following a directed path. When this happens, there is 
a finite probability that two opposite literals x and x̄ can be reached from some other literal y following a directed 
path. If y is selected by Unit Propagation, at some time both x and x̄ will be marked, and this corresponds to a 
contradiction. This simple argument explains more than just the condition γ = α(1 − p) = 1 for the failure of the 
heuristic search. It can also be used to explain the exponent 1/3 in the scaling (32), see [78, 79] for more details. 
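
The graph picture can be made concrete with a simplified sketch, in which the graph is kept static: vertices are not removed after being assigned and no new edges coming from 3-clauses are added, so that only the diffusion of the marks and the detection of a contradiction are illustrated. Literals are encoded as signed integers and the function names are hypothetical.

```python
from collections import defaultdict, deque

def implication_graph(two_clauses):
    """Each 2-clause (a, b) yields the two implications -a -> b and -b -> a."""
    graph = defaultdict(set)
    for a, b in two_clauses:
        graph[-a].add(b)
        graph[-b].add(a)
    return graph

def mean_out_degree(graph, n_free_vars):
    """Average out-degree over the 2 * n_free_vars literal vertices."""
    return sum(len(targets) for targets in graph.values()) / (2 * n_free_vars)

def propagate(graph, marked):
    """Diffuse the marks along directed edges; returns (assigned literals, contradiction flag)."""
    assigned, queue = set(), deque(marked)
    while queue:
        x = queue.popleft()
        if -x in assigned:
            return assigned, True          # both x and its opposite have been marked
        if x not in assigned:
            assigned.add(x)
            queue.extend(graph.get(x, ()))
    return assigned, False
```

With C_2 distinct 2-clauses over N − T free variables the graph has 2 C_2 edges and 2 (N − T) vertices, so `mean_out_degree` returns the ratio C_2/(N − T) quoted above.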

B. Backtrack-based search: the Davis-Putnam-Loveland-Logemann procedure 

The heuristic search procedure of the previous Section can easily be turned into a complete procedure for finding 
solutions or proving that formulas are not satisfiable. When a contradiction is found the algorithm now backtracks to 
the last variable assigned by the heuristic (unit clause propagations are merely consequences of previous assignments), 
inverts its value, and the search resumes. If another contradiction is found the algorithm backtracks to the last-but-one 
assigned variable, and so on. The algorithm stops either when a solution is found, or when all possible backtracks have been 
unsuccessful, in which case a proof of unsatisfiability is obtained. This algorithm was proposed by Davis, Putnam, Loveland 
and Logemann and is referred to as DPLL in the following. 
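
A compact recursive sketch of this backtracking procedure is given below. It is an illustration only: the free choice simply takes the first literal of the first clause, which stands in for the UC or GUC rules, clauses are tuples of signed integers, and the names `assign`, `dpll` and the `stats` counter are hypothetical.

```python
def assign(clauses, lit):
    """Make literal `lit` true: remove satisfied clauses, shorten the clauses containing -lit."""
    return [tuple(l for l in c if l != -lit) for c in clauses if lit not in c]

def dpll(clauses, stats=None):
    """Return a list of true literals satisfying `clauses`, or None if they are unsatisfiable."""
    if stats is None:
        stats = {"splits": 0}
    trail = []
    while True:                                   # unit propagation
        if () in clauses:
            return None                           # contradiction: the caller backtracks
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        trail.append(units[0])
        clauses = assign(clauses, units[0])
    if not clauses:
        return trail                              # empty formula: solution found
    lit = clauses[0][0]                           # free choice (placeholder heuristic)
    stats["splits"] += 1
    for choice in (lit, -lit):                    # try one value, then the opposite one
        rest = dpll(assign(clauses, choice), stats)
        if rest is not None:
            return trail + [choice] + rest
    return None                                   # both values fail: subtree refuted
```

The counter `stats["splits"]` records the number of free choices, i.e. the number of internal nodes of the search tree, which is the quantity whose growth with N is discussed below.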

The history of the search process can be represented by a search tree, where the nodes represent the variables 
assigned, and the descending edges their values (Figure 8). The leaves of the tree correspond to solutions (S), or to 
contradictions (C). The analysis of the α < α_H regime in the previous Section leads us to the conclusion that search 
trees look like Figure 8A at small ratios*. 

For ratios α > α_H DPLL is very likely to find a contradiction. Backtracking enters into play, and is responsible 
for the drastic slowing down of the algorithm. The success-to-failure transition taking place in the non-backtracking 
algorithm turns into a polynomial-to-exponential transition in DPLL. The question is to compute the growth exponent τ of 
the average tree size, T ∼ e^{N τ(α)}, as a function of the ratio α. 

1. Exponential regime: Unsatisfiable formulas 

Consider first the case of unsatisfiable formulas (α > α_s), where all leaves carry contradictions after DPLL halts 
(Figure 8B). DPLL builds the tree in a sequential manner, adding nodes and edges one after the other, and completing 
branches through backtracking steps. We can think of the same search tree built in a parallel way [80]. At time (depth) 
T our tree is composed of L(T) ≤ 2^T branches, each carrying a partial assignment over T variables. Step T consists 
in assigning one more variable to each branch, according to the DPLL rules, that is, through unit propagation or the 
heuristic rule. In the latter case we will speak of a splitting event, as two branches will emerge from this node, 
corresponding to the two possible values of the variable assigned. The possible consequences of this assignment are 
the emergence of a contradiction (which puts an end to the branch), or the simplification of the attached formula (the 
branch keeps growing). 

* A small amount of backtracking may be necessary to find the solution since P_success < 1 [75], but the overall picture of a single branch 
is not qualitatively affected. 

FIG. 8: Search trees generated by DPLL: A. linear, satisfiable (α < α_H); B. exponential, unsatisfiable (α > α_s); C. exponential, 
satisfiable (α_H < α < α_s). Leaves are marked with S (solutions) or C (contradictions). G is the highest node to which DPLL 
backtracks, see Figure 7. 

The number of branches L(T) is a stochastic variable. Its average value can be calculated as follows [81]. Let 
us define the average number L(C; T) of branches of depth T which bear a formula containing C_3 (resp. C_2, C_1) 
clauses of length 3 (resp. 2, 1), with C = (C_1, C_2, C_3). Initially L(C; 0) = 1 for C = (0, 0, αN), and 0 otherwise. We shall 
call M(C', C; T) the average number of branches described by C' generated from a C branch once the T-th variable is 
assigned [79, 80]. We have 0 ≤ M ≤ 2, the extreme values corresponding to a contradiction and to a split respectively. 
We claim that 

$$ L(C'; T+1) = \sum_{C} M(C', C; T) \, L(C; T) . \qquad (33) $$

The evolution equation (33) may look somewhat suspicious at first sight, due to its similarity with the approximation 
we have sketched in Sec. IV B for the analysis of PRWSAT. Yet, thanks to the linearity of expectation, the correlations 
between the branches (or better, between the instances carried by the branches) do not matter as far as the average number 
of branches is concerned. 

For large N we expect that the number of alive branches (those not hit by a contradiction) grows exponentially with the 
depth, or, equivalently, 

$$ \sum_{C_1, C_2, C_3} L(C_1, C_2, C_3; T) \sim e^{N \lambda(t) + o(N)} . \qquad (34) $$

The argument of the exponential, λ(t), can be found using partial differential equation techniques generalizing the 
ordinary differential equation techniques used for a single branch in the absence of backtracking (Section V A). Details 
can be found in [81]. The outcome is that λ(t) is a function growing from λ = 0 at t = 0, reaching a maximum 
value λ_M at some depth t_M, and decreasing at larger depths. t_M is the depth in the tree of Figure 8B where most 
contradictions are found; the number of contradiction leaves is, to exponential order, e^{N λ_M}. We conclude that the 
logarithm of the average size of the tree we were looking for is 

$$ \tau = \lambda_M . \qquad (35) $$

For large α ≫ α_s one finds τ = O(1/α), in agreement with the asymptotic scaling of [82]. The calculation can be 
extended to higher values of k. 

2. Exponential regime: Satisfiable formulas 

The above calculation holds for the unsatisfiable, exponential phase. How can we understand the satisfiable but 
exponential regime α_H < α < α_s? The resolution trajectory crosses the SAT/UNSAT critical line α_s(p) at some 
point G shown in Figure 7. Immediately after G the instance left by DPLL is unsatisfiable. A subtree with all its 
leaves carrying contradictions will develop below G (Figure 8C). The size of this subtree can be easily calculated 
from the above theory, from the knowledge of the coordinates (p_G, α_G) of G. Once this subtree has been built DPLL 
backtracks to G, flips the attached variable and will finally end up with a solution. Hence the (log of the) number of 
splits necessary will be equal to τ = (1 − t_G) λ_M, with λ_M evaluated at the point (p_G, α_G) [80]. Remark that our calculation gives the logarithm of the average 
subtree size starting from the typical value of G. Numerical experiments show that the resulting value for τ coincides 
very accurately with the most likely tree size for finding a solution. The reason is that fluctuations in the sizes are 
mostly due to fluctuations of the highest backtracking point G, that is, of the first part of the search trajectory [77]. 

C. Message passing algorithms 

According to the thought experiment proposed at the beginning of this Section, valuable information could be 
obtained from the knowledge of the marginal probabilities of variables in the uniform measure over optimal 
configurations. This is an inference problem in the graphical model associated with the formula. In this field message passing 
techniques (for instance Belief Propagation, or the min-sum algorithm) are widely used to compute such marginals 
approximately [46, 48]. These numerical procedures introduce messages on the directed edges of the factor graph 
representation of the problem (recall the definitions given in Sec. III D), which are iteratively updated, the new value 
of a message being computed from the old values of the incoming messages (see Fig. 4). When the underlying graph 
is a tree, the message updates are guaranteed to converge in a finite number of steps, and provide exact results. 
In the presence of cycles the convergence of these recurrence equations is not guaranteed; they can however be used 
heuristically, the iterations being repeated until a fixed point has been reached (within a tolerance threshold). Though 
very few general results on the convergence in the presence of loops are known [83] (see also [84] for random SAT 
formulas at low α), these heuristic procedures are often found to yield good approximations of the marginals on generic factor 
graph problems. 

The interest in this approach for solving random SAT instances was triggered in the statistical mechanics community 
by the introduction of the Survey Propagation (SP) algorithm [21]. Since then several generalizations and reinterpretations 
of SP have been put forward, see for instance [85-90]. In the following paragraph we present three different message 
passing procedures, which differ in the nature of the messages passed between nodes, following rather closely the 
presentation of [47], to which we refer the reader for further details. We then discuss how these procedures have to 
be interleaved with assignment (decimation) steps in order to constitute a solver algorithm. Finally we shall review 
results obtained in a particular limit case (satisfiable formulas at large α). 



1. Definition of the message-passing algorithms 

• Belief Propagation (BP) 

For the sake of readability we recall here the recursive equations (26) stated in Sec. III D for the uniform 
probability measure over the solutions of a tree formula, 

$$ h_{i \to a} = \sum_{b \in \partial_+ i(a)} u_{b \to i} - \sum_{b \in \partial_- i(a)} u_{b \to i} , \qquad u_{a \to i} = -\frac{1}{2} \ln\left( 1 - \prod_{j \in \partial a \setminus i} \frac{1 - \tanh h_{j \to a}}{2} \right) , \qquad (36) $$

where the h and u messages are reals (positive for u), parametrizing the marginal probabilities (beliefs) for the 
value of a variable in the absence of some constraint nodes around it (cf. Eq. (25)). These equations can be used 
in the heuristic way explained above for any formula, and constitute the BP message-passing equations. Note 
that in the course of the simplification process the degrees of the clauses change; we thus adopt here and in the 
following the natural convention that sums (resp. products) over empty sets of indices are equal to 0 (resp. 1). 

• Warning Propagation (WP) 

The above-stated version of the BP equations becomes ill-defined for an unsatisfiable formula, whether this 
was the case of the original formula or because of some wrong assignment steps; in particular the normalization 
constant of Eq. (24) vanishes. A way to cure this problem consists in introducing a fictitious inverse temperature 
β and deriving the BP equations corresponding to the regularized Gibbs-Boltzmann probability law (20), taking 
as the energy function the number of unsatisfied constraints. In the limit β → ∞, in which the Gibbs-Boltzmann 
measure concentrates on the optimal assignments, one can single out a part of the information conveyed by the 
BP equations to obtain the simpler Warning Propagation rules. Indeed the messages h, u are at leading order 
proportional to β, with proportionality coefficients we shall denote ĥ and û. These messages are less informative 
than the ones of BP, yet simpler to handle. One finds indeed that instead of reals the WP messages are integers, 
more precisely ĥ ∈ ℤ and û ∈ {0, 1}. They obey the following recursive equations (with a structure similar to 
the ones of BP), 

$$ \hat h_{i \to a} = \sum_{b \in \partial_+ i(a)} \hat u_{b \to i} - \sum_{b \in \partial_- i(a)} \hat u_{b \to i} , \qquad \hat u_{a \to i} = \prod_{j \in \partial a \setminus i} \mathbb{I}( \hat h_{j \to a} < 0 ) , \qquad (37) $$

where I(E) is the indicator function of the event E. The interpretation of these equations goes as follows. û_{a→i} 
is equal to 1 if, in all optimal assignments of the amputated formula in which i is only constrained by a, i takes 
the value satisfying a. This happens if all other variables of clause a (i.e. ∂a \ i) are required to take their values 
unsatisfying a, hence the form of the second part of (37). In such a case we say that a sends a warning to variable 
i. In the first part of (37), the message ĥ_{i→a} sent by a variable to a clause is computed by weighing the number 
of warnings sent by all other clauses; it will in particular be negative if a majority of clauses requires i to take 
the value unsatisfying a. 

• Survey Propagation (SP) 

The convergence of BP and WP iterations is not ensured on loopy graphs. In particular the clustering 
phenomenon described in Sec. III A is likely to spoil the efficiency of these procedures. The Survey Propagation 
(SP) algorithm introduced in [21] has been designed to deal with such clustered spaces of configurations. The 
underlying idea is that the simple iterations (of BP or WP type) remain valid inside each cluster of optimal 
assignments; for each of these clusters γ and each directed edge of the factor graph one has a message ĥ^γ_{i→a} (and 
û^γ_{a→i}). One introduces on each edge a survey of these messages, defined as their probability distribution with 
respect to the choice of the clusters. Then some hypotheses are made on the structure of the cluster 
decomposition in order to write closed equations on the surveys. We now make this approach explicit in a version adapted to 
satisfiable instances [47], taking as the basic building block the WP equations. This leads to a rather simple 
form of the surveys. Indeed û_{a→i} can only take two values; its probability distribution can thus be parametrized 
by a single real δ_{a→i} ∈ [0, 1], the probability that û_{a→i} = 1. Similarly the survey γ_{i→a} is the probability that 
ĥ_{i→a} < 0. The second part of (37) is readily translated in probabilistic terms, 

$$ \delta_{a \to i} = \prod_{j \in \partial a \setminus i} \gamma_{j \to a} . \qquad (38) $$

The other part of the recursion takes a slightly more complicated form, 

$$ \gamma_{i \to a} = \frac{(1 - \pi^-_{i \to a}) \, \pi^+_{i \to a}}{\pi^+_{i \to a} + \pi^-_{i \to a} - \pi^+_{i \to a} \pi^-_{i \to a}} , \quad \text{with} \quad \pi^+_{i \to a} = \prod_{b \in \partial_+ i(a)} (1 - \delta_{b \to i}) , \quad \pi^-_{i \to a} = \prod_{b \in \partial_- i(a)} (1 - \delta_{b \to i}) . \qquad (39) $$

In this equation π^+_{i→a} (resp. π^-_{i→a}) corresponds to the probability that none of the clauses agreeing (resp. 
disagreeing) with a on the value of the literal of i sends a warning. For i to be constrained to the value 
unsatisfying a, at least one of the clauses of ∂_- i(a) should send a warning, and none of ∂_+ i(a), which explains 
the form of the numerator of γ_{i→a}. The denominator arises from the exclusion of the event that both clauses 
in ∂_+ i(a) and ∂_- i(a) send messages, a contradictory event in this version of SP which is devised for satisfiable 
formulas. 

From the statistical mechanics point of view the SP equations arise from a 1RSB cavity calculation, as sketched 
in Sec. III D, in the zero temperature limit (β → ∞) and vanishing Parisi parameter m, these two limits being 
either taken simultaneously as in [21, 89] or successively [22]. One can thus compute, from the solution of 
the recursive equations on a single formula, an estimate of its complexity, i.e. the number of its clusters 
(irrespective of their sizes). The message passing procedure can also be adapted, at the price of technical 
complications, to unsatisfiable clustered formulas [89]. Note also that the above SP equations have been shown 
to correspond to the BP ones in an extended configuration space where variables can take a "joker" value [85, 86], 
mimicking the variables which are not frozen to a single value in all the assignments of a given cluster. Heuristic 
interpolations between the BP and SP equations have been studied in [86, 87]. 
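
As an illustration of how such recursions are iterated in practice, the following sketch implements the WP rules (37) with parallel updates started from vanishing warnings. It is a minimal example under stated assumptions, not the code of [47]: clauses are tuples of signed integers, the function names are hypothetical, and the total field on a variable is assumed to be the sum of the warnings received from the clauses where it appears unnegated minus those from the clauses where it appears negated, by analogy with the first part of (37).

```python
def warning_propagation(clauses, max_iter=100):
    """Iterate the WP rules from all-zero warnings; returns (warnings, converged flag)."""
    # sign[(a, i)] = +1 if variable i appears unnegated in clause a, -1 otherwise.
    sign = {(a, abs(l)): (1 if l > 0 else -1) for a, c in enumerate(clauses) for l in c}
    occ = {}
    for (a, i) in sign:
        occ.setdefault(i, []).append(a)
    u = {edge: 0 for edge in sign}          # u[(a, i)]: warning sent by clause a to variable i

    def h(i, a):
        # Cavity field: warnings from clauses agreeing with a minus those disagreeing with a.
        return sum(u[(b, i)] * sign[(b, i)] * sign[(a, i)] for b in occ[i] if b != a)

    for _ in range(max_iter):
        new_u = {}
        for (a, i) in sign:
            others = [abs(l) for l in clauses[a] if abs(l) != i]
            # Product over an empty set is 1: a unit clause always sends a warning.
            new_u[(a, i)] = int(all(h(j, a) < 0 for j in others))
        if new_u == u:
            return u, True                  # fixed point reached
        u = new_u
    return u, False

def local_fields(clauses, u):
    """Total field on each variable; a positive (negative) sign suggests True (False)."""
    fields = {}
    for a, c in enumerate(clauses):
        for l in c:
            fields[abs(l)] = fields.get(abs(l), 0) + u[(a, abs(l))] * (1 if l > 0 else -1)
    return fields
```

On a tree-like formula such as `[(1,), (-1, 2), (-2, 3)]` the warnings propagate from the unit clause along the chain of implications, and the fixed point is reached after a number of sweeps of the order of the depth of the tree.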
2. Exploiting the information 

The information provided by these message passing procedures can be exploited in order to solve satisfiability 
formulas; in the algorithm sketched at the beginning of Sec. V A the heuristic choice of the assigned variable, and of its 
truth value, can be made according to the results of the message passing on the current formula. If BP were an exact 
inference algorithm, one could choose any unassigned variable, compute its marginal according to Eq. (27), and draw 
its value according to this probability. Of course BP is only an approximate procedure, hence a practical implementation of 
this idea should favor the variables with marginal probabilities closest to a deterministic law (i.e. with the largest 
|h_i|), motivated by the intuition that these are the least subject to the approximation errors of BP. Similarly, if the 
message passing procedure used at each assignment step is WP, one can fix the variable with the largest |ĥ_i| to the 
value corresponding to the sign of ĥ_i. In the case of SP, the solution of the message passing equations is used to 
compute, for each unassigned variable i, a triplet of numbers (γ_i^+, γ_i^-, γ_i^0) according to 

$$ \gamma_i^+ = \frac{(1 - \pi_i^+) \, \pi_i^-}{\pi_i^+ + \pi_i^- - \pi_i^+ \pi_i^-} , \quad \gamma_i^- = \frac{(1 - \pi_i^-) \, \pi_i^+}{\pi_i^+ + \pi_i^- - \pi_i^+ \pi_i^-} , \quad \gamma_i^0 = 1 - \gamma_i^+ - \gamma_i^- , \quad \text{with} \quad \pi_i^\pm = \prod_{a \in \partial_\pm i} (1 - \delta_{a \to i}) . \qquad (40) $$

γ_i^+ (resp. γ_i^-) is interpreted as the fraction of clusters in which σ_i = +1 (resp. σ_i = −1) in all solutions of the cluster, 
hence γ_i^0 corresponds to the clusters in which σ_i can take both values. In the version of [47], one then chooses the 
variable with the largest |γ_i^+ − γ_i^-|, and fixes it to σ_i = +1 (resp. σ_i = −1) if γ_i^+ > γ_i^- (resp. γ_i^+ < γ_i^-). In this way 
one tries to select an assignment preserving the maximal number of clusters. 
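
The SP recursion (38)-(39) and the selection rule based on (40) can be sketched as follows. This is an illustrative example, not the implementation [92]: the surveys are stored in a dictionary `delta[(a, i)]` indexed by clause number and variable, clauses are tuples of signed integers, the function names are hypothetical, and the denominators are assumed to be non-zero (no contradictory warnings).

```python
def sp_sweep(clauses, delta):
    """One parallel update of the surveys delta[(a, i)] following Eqs. (38)-(39)."""
    sign = {(a, abs(l)): l > 0 for a, c in enumerate(clauses) for l in c}

    def gamma(i, a):
        pi_plus = pi_minus = 1.0
        for (b, j), s in sign.items():
            if j == i and b != a:
                if s == sign[(a, i)]:
                    pi_plus *= 1.0 - delta[(b, j)]     # clauses agreeing with a on i
                else:
                    pi_minus *= 1.0 - delta[(b, j)]    # clauses disagreeing with a on i
        return (1.0 - pi_minus) * pi_plus / (pi_plus + pi_minus - pi_plus * pi_minus)

    new = {}
    for (a, i) in sign:
        prod = 1.0
        for l in clauses[a]:
            if abs(l) != i:
                prod *= gamma(abs(l), a)
        new[(a, i)] = prod
    return new

def sp_biases(clauses, delta):
    """Triplets (gamma+, gamma-, gamma0) of Eq. (40) for every variable of the formula."""
    pi_plus, pi_minus = {}, {}
    for a, clause in enumerate(clauses):
        for lit in clause:
            i = abs(lit)
            pi_plus.setdefault(i, 1.0)
            pi_minus.setdefault(i, 1.0)
            if lit > 0:
                pi_plus[i] *= 1.0 - delta[(a, i)]
            else:
                pi_minus[i] *= 1.0 - delta[(a, i)]
    biases = {}
    for i in pi_plus:
        norm = pi_plus[i] + pi_minus[i] - pi_plus[i] * pi_minus[i]
        g_plus = (1.0 - pi_plus[i]) * pi_minus[i] / norm
        g_minus = (1.0 - pi_minus[i]) * pi_plus[i] / norm
        biases[i] = (g_plus, g_minus, 1.0 - g_plus - g_minus)
    return biases

def pick_variable(biases):
    """Variable with the largest |gamma+ - gamma-| and the value it should be fixed to."""
    i = max(biases, key=lambda v: abs(biases[v][0] - biases[v][1]))
    return i, biases[i][0] > biases[i][1]
```

In a decimation solver one would repeat `sp_sweep` until the surveys stop changing within a tolerance, call `pick_variable`, simplify the formula accordingly, and start again on the reduced formula.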

Of course many variants of these heuristic rules can be devised; for instance after each message passing computation 
one can fix a finite fraction of the variables (instead of a single one), allow for some amount of backtracking [91], or 
increase a soft bias instead of completely assigning a variable [90]. Moreover the tolerance on the level of convergence 
of the message passing itself can also be adjusted. All these implementation choices affect the performance of the 
solver, in particular the maximal value of α up to which random SAT instances are solved efficiently, which makes 
a precise statement about the limits of these algorithms difficult. Consequently we shall only report the impressive 
result of [47], which presents an implementation [92] working for random 3-SAT instances up to α = 4.24 (very close 
to the conjectured satisfiability threshold α_s ≃ 4.267) for problem sizes as large as N = 10^. 

The theoretical understanding of these message passing inspired solvers is still poor compared to the algorithms 
studied in Sec. VA, which use much simpler heuristics in their assignment steps. One difficulty is the description of 
the residual formula after an extensive number of variables have been assigned; because of the correlations between 
successive steps of the algorithm this residual formula is not uniformly distributed conditioned on a few dynamical 
parameters, as was the case with $(\alpha(t), p(t))$ for the simpler heuristics of Sec. VA. One version of BP guided decimation 
could however be studied analytically in [93], by means of an analysis of the thought experiment discussed at the 
beginning of Sec. V. The study of another simple message passing algorithm is presented in the next paragraph. 



3. Warning Propagation on dense random formulas 

Feige proved in [94] a remarkable connection between the worst-case complexity of approximation problems and 
the structure of random 3-SAT at large (but independent of $N$) values of the ratio $\alpha$. He introduced the following 
hardness hypothesis for random 3-SAT formulas: 

Hypothesis 1: Even if $\alpha$ is arbitrarily large (but independent of $N$), there is no polynomial time algorithm that on 
most 3-SAT formulas outputs UNSAT, and always outputs SAT on a 3-SAT formula that is satisfiable. 

and used it to derive hardness of approximation results for various computational problems. As we have seen these 
instances are typically unsatisfiable; the problem of interest is thus to recognize efficiently the rare satisfiable instances 
of the distribution. 

A variant of this problem was studied in [95], where WP was proven to be effective in finding solutions of dense 
planted random formulas (the planted distribution is the uniform distribution conditioned on being satisfied by a 
given assignment). More precisely, [95] proves that for $\alpha$ large enough (but independent of $N$), the following holds 
with probability $1 - e^{-O(\alpha)}$: 

1. WP converges after at most $O(\ln N)$ iterations. 

2. If a variable $i$ has $h_i \neq 0$, then the sign of $h_i$ is equal to the value of $\sigma_i$ in the planted assignment. The number 
of such variables is bigger than $N(1 - e^{-O(\alpha)})$ (i.e. almost all variables can be reconstructed from the values of 
$h_i$). 

3. Once these variables are fixed to their correct assignments, the remaining formula can be satisfied in time $O(N)$ 
(in fact, it is a tree formula). 
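
The setting of [95] can be explored numerically with the following sketch, which draws a planted 3-SAT formula by rejection and runs WP from random initial warnings. The update rule used here (a clause warns a variable when all its other variables are forced, by their other clauses, to violate it), the sign convention for $h_i$, the random sequential schedule and all function names are assumptions made for illustration; the sketch makes no claim about convergence at small $N$.

```python
import random


def planted_3sat(n_vars, alpha, planted, seed=0):
    """Draw int(alpha * n_vars) random 3-clauses, keeping only those satisfied
    by `planted` (dict variable -> +1/-1). Literals are signed integers."""
    rng = random.Random(seed)
    clauses = []
    while len(clauses) < int(alpha * n_vars):
        variables = rng.sample(range(1, n_vars + 1), 3)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        if any((l > 0) == (planted[abs(l)] > 0) for l in clause):
            clauses.append(clause)
    return clauses


def warning_propagation(n_vars, clauses, max_sweeps=100, seed=0):
    """Run WP; return (converged, h) with h[v] the local field of variable v.
    Assumes that the variables within a clause are distinct."""
    rng = random.Random(seed)
    # u[(a, v)] in {0, 1}: warning sent by clause a to variable v.
    u = {(a, abs(l)): rng.randint(0, 1)
         for a, c in enumerate(clauses) for l in c}
    occ = {v: [] for v in range(1, n_vars + 1)}  # v -> [(clause, sign of v)]
    for a, c in enumerate(clauses):
        for l in c:
            occ[abs(l)].append((a, 1 if l > 0 else -1))

    def cavity_field(v, a):
        # Field on v created by all clauses except a.
        return sum(s * u[(b, v)] for b, s in occ[v] if b != a)

    converged = False
    for _ in range(max_sweeps):
        changed = False
        edges = list(u)
        rng.shuffle(edges)
        for a, v in edges:
            # Warning iff every other literal of a is forced to be false.
            new = int(all(cavity_field(abs(l), a) * (1 if l > 0 else -1) < 0
                          for l in clauses[a] if abs(l) != v))
            if new != u[(a, v)]:
                u[(a, v)] = new
                changed = True
        if not changed:
            converged = True
            break
    h = {v: sum(s * u[(a, v)] for a, s in occ[v]) for v in occ}
    return converged, h


if __name__ == "__main__":
    n = 200
    planted = {v: random.Random(v).choice([-1, 1]) for v in range(1, n + 1)}
    formula = planted_3sat(n, 20.0, planted)
    ok, h = warning_propagation(n, formula)
    agree = sum(1 for v in h if h[v] * planted[v] > 0)
    print(ok, agree, sum(1 for v in h if h[v] == 0))
```

On a formula containing unit clauses the same routine reduces to unit propagation, which gives a simple way to check the sign conventions.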

On the basis of non-rigorous statistical mechanics methods, these results were argued in [96] to remain true when 
the planted distribution is replaced by the uniform distribution conditioned on being satisfiable. In other words, 
by iterating WP for a number of iterations bigger than $O(\ln N)$ one is able to detect the rare satisfiable instances 
at large $\alpha$. The argument is based on the similarity of structure between the two distributions at large $\alpha$, namely 
the existence of a single, small cluster of solutions where almost all variables are frozen to a given value. This 
correspondence between the two distributions of instances was proven rigorously in [97], where it was also shown that 
a related polynomial algorithm succeeds with high probability in finding solutions of the satisfiable distribution of 
large enough density $\alpha$. 

These results indicate that a stronger form of Hypothesis 1, obtained by replacing "always" with "with probability $p$" 
(with respect to the uniform distribution over the formulas and possibly to some randomness built in the algorithm), is 
wrong for any $p < 1$. However, the validity of Hypothesis 1 itself is still unknown for random 3-SAT instances. Nevertheless, 
this result is interesting because it is one of the rare cases in which the performance of a message-passing algorithm 
could be analyzed in full detail. 

VI. CONCLUSION 

This review was mainly dedicated to the random $k$-Satisfiability and $k$-Xor-Satisfiability problems; the approach 
and results we presented however extend to other random decision problems, in particular random graph $q$-coloring. 
This problem consists in deciding whether each vertex of a graph can be assigned one out of $q$ possible colors, without 
giving the same color to the two extremities of an edge. When input graphs are randomly drawn from the Erdős-Rényi 
(ER) ensemble $G(N, p = c/N)$, a phase diagram similar to the one of $k$-SAT (Section III) is obtained. There exists 
a colorable/uncolorable phase transition for some critical average degree $c_s(q)$, with for instance $c_s(3) \simeq 4.69$ [98]. 
The colorable phase also exhibits the clustering and condensation transitions [99] we explained on the example of 
$k$-Satisfiability. Actually, what seems to matter here is the structure of inputs and the symmetry properties 
of the decision problem rather than its specific details. All the input models considered above share a common, 
underlying ER random graph structure. From this point of view it would be interesting to 'escape' from the ER 
ensemble and consider more structured graphs, e.g. embedded in a low dimensional space. 
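
As a concrete illustration of the $q$-coloring decision problem on the ER ensemble, the sketch below draws a graph from $G(N, p = c/N)$ and searches for a proper coloring by exhaustive backtracking. The function names are hypothetical and the search is exponential in the worst case, so it is only meant for small $N$; it is not one of the algorithms analyzed in this review.

```python
import random
from itertools import combinations


def erdos_renyi(n, c, seed=0):
    """Edge list of a graph drawn from G(n, p = c / n)."""
    rng = random.Random(seed)
    p = c / n
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]


def is_proper(edges, colors):
    """True if no edge joins two vertices of the same color."""
    return all(colors[i] != colors[j] for i, j in edges)


def find_coloring(n, edges, q):
    """Return a proper q-coloring as a list, or None if there is none."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    colors = [None] * n

    def extend(v):
        # Vertices 0..v-1 are colored; try every color for vertex v.
        if v == n:
            return True
        for col in range(q):
            if all(colors[w] != col for w in adj[v]):
                colors[v] = col
                if extend(v + 1):
                    return True
        colors[v] = None
        return False

    return colors if extend(0) else None


if __name__ == "__main__":
    graph = erdos_renyi(30, 3.0)
    coloring = find_coloring(30, graph, 3)
    print(coloring is not None and is_proper(graph, coloring))
```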

To what extent the similarity between phase diagrams corresponds to similar behaviour in terms of hardness of 
resolution is an open question. Consider the case of rare satisfiable instances for random $k$-SAT and $k$-XORSAT 
well above their sat/unsat thresholds (Section V). Both problems share very similar statistical features. However, while 
a simple message-passing algorithm allows one to easily find a (the) solution for the $k$-SAT problem, this algorithm is 
inefficient for random $k$-XORSAT. Actually the local or decimation-based algorithms of Sections IV and V are efficient 
at finding solutions to rare satisfiable instances of random $k$-SAT [100], but none of them works for random $k$-XORSAT 
(while the problem is in P!). This example raises the important question of the relationship between the statistical 
properties of solutions (or quasi-solutions) encoded in the phase diagram and the (average) computational hardness. 
Very little is known about this crucial point; on intuitive grounds one could expect the clustering phenomenon to 
prevent an efficient solving of formulas by local search algorithms of the random walk type. This is indeed true for 
a particular class of stochastic processes [101], those which respect the so-called detailed balance conditions. This 
connection between clustering and hardness of resolution for local search algorithms is much less obvious when the 
detailed balance conditions are not respected, which is the case for most of the efficient variants of PRWSAT. 
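
The remark that $k$-XORSAT is in P can be made concrete: a formula is a linear system modulo 2, so Gaussian elimination decides satisfiability and produces a solution in polynomial time, in contrast with the local and decimation-based procedures discussed above. The following sketch, with hypothetical function names and equations encoded as (variables, parity) pairs, stores each row as an integer bitmask.

```python
def solve_xorsat(n_vars, equations):
    """Solve a system of XOR equations by Gaussian elimination over GF(2).

    equations: list of (variables, parity), meaning that the sum modulo 2 of
    the listed variables (indices in range(n_vars)) equals parity (0 or 1).
    Returns a list of 0/1 values satisfying all equations, or None.
    """
    pivots = {}  # pivot variable -> (row bitmask, right-hand side)
    for variables, rhs in equations:
        mask = 0
        for v in variables:
            mask ^= 1 << v  # a variable listed twice cancels modulo 2
        # Remove every existing pivot variable from the new row.
        for p, (p_mask, p_rhs) in pivots.items():
            if (mask >> p) & 1:
                mask ^= p_mask
                rhs ^= p_rhs
        if mask == 0:
            if rhs:
                return None  # reduced to 0 = 1: the system is inconsistent
            continue         # redundant equation
        p = mask.bit_length() - 1  # highest remaining variable becomes pivot
        # Keep the system fully reduced: remove p from the older rows.
        for q, (q_mask, q_rhs) in list(pivots.items()):
            if (q_mask >> p) & 1:
                pivots[q] = (q_mask ^ mask, q_rhs ^ rhs)
        pivots[p] = (mask, rhs)
    # Free variables are set to 0, so each pivot equals its right-hand side.
    x = [0] * n_vars
    for p, (_, p_rhs) in pivots.items():
        x[p] = p_rhs
    return x


def satisfies(x, equations):
    """Check that assignment x fulfils every equation."""
    return all(sum(x[v] for v in vs) % 2 == b for vs, b in equations)


if __name__ == "__main__":
    system = [((0, 1, 2), 1), ((1, 2, 3), 0), ((0, 3), 1)]
    solution = solve_xorsat(4, system)
    print(solution, satisfies(solution, system))
```

Setting the free variables to other values enumerates the whole affine space of solutions, which is the global structure that local moves fail to exploit.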



[1] M. Mezard, G. Parisi, and M. Virasoro, Spin glass theory and beyond (World Scientific, Singapore, 1987). 

[2] C. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity (Dover, New York, 1998). 

[3] Y. Fu and P. W. Anderson, Journal of Physics A: Mathematical and General 19, 1605 (1986). 

[4] D. Mitchell, B. Selman, and H. Levesque (1992), no. 459 in Proceedings of the Tenth National Conference on Artificial 
Intelligence. 

[5] J. Hertz, A. Krogh, and R. Palmer, Introduction to the theory of neural computation, Santa Fe Institute Studies in the 
Science of Complexity (Addison-Wesley, Redwood City (CA), 1991). 






[6] T. Cover, IEEE Transactions on Electronic Computers 14, 326 (1965). 

[7] S. Janson, T. Luczak, and A. Rucinski, Random graphs (John Wiley and Sons, New York, 2000). 
[8] E. Friedgut, Journal of the American Mathematical Society 12, 1017 (1999). 
[9] O. Dubois, Theoret. Comput. Sci. 265, 187 (2001). 
[10] J. Franco, Theoret. Comput. Sci. 265, 147 (2001). 

[11] D. Achlioptas and Y. Peres, Journal of the American Mathematical Society 17, 947 (2004). 

[12] Chapter random sat, this volume. 

[13] N. Alon and J. Spencer, The probabilistic method (John Wiley and sons, New York, 2000). 

[14] A. Dembo and O. Zeitouni, Large deviations. Theory and applications (Springer, Berlin, 1998). 

[15] W. Krauth and M. Mezard, J. Physique 50, 3057 (1989). 

[16] S. K. Ma, Statistical Mechanics (World Scientific, Singapore, 1985). 

[17] K. Huang, Statistical Mechanics (John Wiley and Sons, New York, 1990). 

[18] A. Broder, A. Frieze, and E. Upfal (1993), no. 322 in Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms. 
[19] R. Monasson and R. Zecchina, Phys. Rev. E 56, 1357 (1997). 
[20] G. Biroli, R. Monasson, and M. Weigt, Eur. Phys. J. B 14, 551 (2000). 
[21] M. Mezard and R. Zecchina, Phys. Rev. E 66, 056126 (2002). 

[22] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborova, Proceedings of the National Academy of Sciences 104, 10318 (2007), http://www.pnas.org/cgi/reprint/104/25/10318.pdf. 
[23] R. Monasson and D. O'Kane, Europhysics Letters 27, 85 (1994). 
[24] T. R. Kirkpatrick and D. Thirumalai, Phys. Rev. B 36, 5388 (1987). 
[25] M. Talagrand, Spin glasses: a challenge for mathematicians (Springer, Berlin, 2003). 
[26] D. Panchenko and M. Talagrand, Probab. Theory Relat. Fields 130, 319 (2004). 
[27] S. Franz and M. Leone, J. Stat. Phys. 111, 535 (2003). 

[28] M. Mezard, F. Ricci-Tersenghi, and R. Zecchina, J. Stat. Phys. 111, 505 (2003). 
[29] S. Cocco, O. Dubois, J. Mandler, and R. Monasson, Phys. Rev. Lett. 90, 047205 (2003). 
[30] M. Mezard, T. Mora, and R. Zecchina, Physical Review Letters 94, 197205 (pages 4) (2005). 
[31] H. Daude, M. Mezard, T. Mora, and R. Zecchina (2005), arXiv:cond-mat/0506053. 

[32] D. Achlioptas and F. Ricci-Tersenghi, Proceedings of the thirty-eighth annual ACM symposium on Theory of computing (2006), arXiv:cs.CC/0611052. 
[33] F. Ricci-Tersenghi, M. Weigt, and R. Zecchina, Phys. Rev. E 63, 026702 (2001). 
[34] B. Pittel, J. Spencer, and N. Wormald, J. Comb. Theory, Ser. B 67, 111 (1996). 

[35] T. Kurtz, J. Appl. Probab. 7, 49 (1970). 

[36] A. Montanari and G. Semerjian, J. Stat. Phys. 124, 103 (2006). 

[37] T. Mora and M. Mezard, Journal of Statistical Mechanics: Theory and Experiment 2006, P10007 (2006). 

[38] S. Mertens, M. Mezard, and R. Zecchina, Random Struct. Algorithms 28, 340 (2006). 

[39] M. Mezard, M. Palassini, and O. Rivoire, Physical Review Letters 95, 200202 (pages 4) (2005). 

[40] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, Journal of Physics A: Mathematical and General 37, 2073 (2004). 

[41] T. Mora and L. Zdeborova (2007), arXiv:0710.3804. 

[42] G. Semerjian, J. Stat. Phys. 130, 251 (2008). 

[43] R. Monasson, Journal of Physics A: Mathematical and General 31, 513 (1998). 
[44] M. Mezard and G. Parisi, Eur. Phys. J. B 20, 217 (2001). 
[45] M. Mezard and G. Parisi, J. Stat. Phys. 111, 1 (2003). 

[46] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, IEEE Trans. Inf. Theory 47, 498 (2001). 
[47] A. Braunstein, M. Mezard, and R. Zecchina, Random Struct. Algorithms 27, 201 (2005). 

[48] J. S. Yedidia, W. T. Freeman, and Y. Weiss, Advances in Neural Information Processing Systems 13, 689 (2001). 
[49] J. S. Yedidia, W. T. Freeman, and Y. Weiss, in Exploring Artificial Intelligence in the New Millennium (2003), p. 239. 
[50] W. Fernandez de la Vega, Theor. Comput. Sci. 265, 131 (2001). 

[51] B. Bollobás, C. Borgs, J. T. Chayes, J. H. Kim, and D. B. Wilson, Random Struct. Algorithms 18, 201 (2001). 
[52] A. Amraoui, A. Montanari, T. Richardson, and R. Urbanke, arXiv:cs.IT/0406050 (2004). 
[53] A. Dembo and A. Montanari, arXiv:math.PR/0702007 (2007). 
[54] D. B. Wilson, Random Struct. Algorithms 21, 182 (2002). 
[55] S. Kirkpatrick and B. Selman, Science 264, 1297 (1994). 

[56] R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, and L. Troyansky, Random Struct. Algorithms 15, 414 (1999). 
[57] P. De Gregorio, A. Lawlor, P. Bradley, and K. Dawson, PNAS 102, 5669 (2005). 

[58] L. Cugliandolo, in Slow relaxations and nonequilibrium dynamics in condensed matter, edited by J. L. Barrat, M. Feigelman, J. Kurchan, and J. Dalibard (Springer-Verlag, Les Houches, France, 2003). 
[59] C. Papadimitriou, in Proceedings of the 32nd Annual Symposium on Foundations of Computer Science (1991), pp. 163-169. 
[60] R. Motwani and P. Raghavan, Randomized algorithms (Cambridge University Press, Cambridge, 1995). 
[61] U. Schoning, Algorithmica 32, 615 (2002), ISSN 0178-4617 (print), 1432-0541 (electronic). 
[62] S. Baumer and R. Schuler, Lecture Notes in Computer Science 2919, 150 (2004). 
[63] G. Semerjian and R. Monasson, Phys. Rev. E 67, 066103 (2003). 
[64] W. Barthel, A. K. Hartmann, and M. Weigt, Phys. Rev. E 67, 066104 (2003). 
[65] T. M. Liggett, Interacting particle systems (Springer, Berlin, 1985). 






[66] M. Alekhnovich and E. Ben-Sasson, SIAM Journal on Computing 36, 1248 (2006). 

[67] B. Selman, H. A. Kautz, and B. Cohen, in Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI'94) (Seattle, 1994), pp. 337-343. 
[68] D. McAllester, B. Selman, and H. Kautz, in Proceedings of the Fourteenth National Conference on Artificial Intelligence (AAAI'97) (Providence, Rhode Island, 1997), pp. 321-326. 
[69] S. Seitz, M. Alava, and P. Orponen, Journal of Statistical Mechanics: Theory and Experiment 2005, P06006 (2005). 
[70] J. Ardelius and E. Aurell, Physical Review E (Statistical, Nonlinear, and Soft Matter Physics) 74, 037702 (pages 4) (2006). 

[71] M. Alava, J. Ardelius, E. Aurell, P. Kaski, S. Krishnamurthy, P. Orponen, and S. Seitz (2007), arXiv:0711.4902. 
[72] M.-T. Chao and J. Franco, SIAM J. Comput. 15, 1106 (1986). 
[73] M.-T. Chao and J. Franco, Inf. Sci. 51, 289 (1990). 
[74] D. Achlioptas, Theor. Comput. Sci. 265, 159 (2001). 
[75] A. Frieze and S. Suen, J. Algorithms 20, 312 (1996). 

[76] D. Achlioptas, L. Kirousis, E. Kranakis, and D. Krizanc, Theor. Comput. Sci. 265, 109 (2001). 

[77] S. Cocco and R. Monasson, Ann. Math. Artif. Intell. 43, 153 (2005). 
[78] C. Deroulers and R. Monasson, Europhysics Letters 68, 153 (2004). 

[79] R. Monasson, in Complex Systems, edited by J. P. Bouchaud, M. Mezard, and J. Dalibard (Elsevier, Les Houches, France, 
2007). 

[80] S. Cocco and R. Monasson, Phys. Rev. Lett. 86, 1654 (2001). 

[81] R. Monasson, A generating function method for the average-case analysis of DPLL, Lecture Notes in Computer Science 3624, 402-413 (2005). 
[82] P. Beame, R. Karp, T. Pitassi, and M. Saks, SIAM Journal on Computing 31, 1048 (2002). 
[83] S. Tatikonda and M. Jordan, in Proc. Uncertainty in Artificial Intell. (2002), vol. 18, pp. 493-500. 
[84] A. Montanari and D. Shah, in SODA (2007), pp. 1255-1264. 

[85] A. Braunstein and R. Zecchina, Journal of Statistical Mechanics: Theory and Experiment 2004, P06007 (2004). 

[86] E. Maneva, E. Mossel, and M. J. Wainwright, in SODA '05: Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms (Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2005), pp. 1089-1098, ISBN 0-89871-585-7. 
[87] E. Aurell, U. Gordon, and S. Kirkpatrick, in NIPS (2004). 
[88] G. Parisi (2003), arXiv:cs.CC/0301015. 

[89] D. Battaglia, M. Kolář, and R. Zecchina, Phys. Rev. E 70, 036107 (2004). 

[90] J. Chavas, C. Furtlehner, M. Mezard, and R. Zecchina, Journal of Statistical Mechanics: Theory and Experiment 2005, P11016 (2005). 
[91] G. Parisi (2003), arXiv:cond-mat/0308510. 
[92] URL http://www.ictp.trieste.it/~zecchina/SP. 

[93] A. Montanari, F. Ricci-Tersenghi, and G. Semerjian (2007), arXiv:0709.1667, to be published in the Proceedings of the 45th Allerton Conference (2007). 
[94] U. Feige, in STOC (2002), pp. 534-543. 

[95] U. Feige, E. Mossel, and D. Vilenchik, Complete convergence of message passing algorithms for some satisfiability problems, Lecture Notes in Computer Science 4110, 339-350 (2006). 
[96] F. Altarelli, R. Monasson, and F. Zamponi, Journal of Physics A: Mathematical and Theoretical 40, 867 (2007). 
[97] A. Coja-Oghlan, M. Krivelevich, and D. Vilenchik, Why almost all k-cnf formulas are easy, to appear (2007). 
[98] F. Krzakala, A. Pagnani, and M. Weigt, Phys. Rev. E 70, 046705 (2004). 

[99] L. Zdeborova and F. Krzakala, Physical Review E (Statistical, Nonlinear, and Soft Matter Physics) 76, 031131 (pages 29) (2007). 

[100] W. Barthel, A. K. Hartmann, M. Leone, F. Ricci-Tersenghi, M. Weigt, and R. Zecchina, Phys. Rev. Lett. 88, 188701 
(2002). 

[101] A. Montanari and G. Semerjian, J. Stat. Phys. 125, 23 (2006). 



